The most important pages are "Circuits Updates - January 2024", "Circuits Updates - July 2023", and "Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" (Bricken et al. 2023). The following table lists the 10 most important pages of Transformer-circuits.pub:
# | Description | URL of the website |
---|---|---|
1. | Circuits Updates - January 2024. A collection of small updates from the Anthropic Interpretability Team. | /2024/jan-update/index.html |
2. | Circuits Updates - July 2023. A collection of small updates from the Anthropic Interpretability Team. | /2023/july-update/index.html |
3. | Towards Monosemanticity: Decomposing Language Models With Dictionary Learning. Bricken et al. 2023. Using a sparse autoencoder we extract a large number of interpretable features from a one-layer .. | /2023/monosemantic-features/index.html |
4. | Circuits Updates - May 2023. A collection of small updates from the Anthropic Interpretability Team. | /2023/may-update/index.html |
5. | Interpretability Dreams. Our present research aims to create a foundation for mechanistic interpretability research. In doing so, it's important to keep sight of what we're trying to lay the foundations .. | /2023/interpretability-dreams/index.html |
6. | Privileged Bases in the Transformer Residual Stream. Elhage et al. 2023. Our mathematical theories of the Transformer architecture suggest that individual coordinates in the residual stream should have .. | /2023/privileged-basis/index.html |
7. | Toy Models of Superposition. Elhage et al. 2022. Neural networks often seem to pack many unrelated concepts into a single neuron - a puzzling phenomenon known as polysemanticity. In our latest .. | /2022/toy_model/index.html |
8. | Softmax Linear Units. Elhage et al. 2022. An alternative activation function increases the fraction of neurons which appear to correspond to human-understandable concepts. | /2022/solu/index.html |
9. | Distributed Representations: Composition & Superposition. An informal note on how distributed representations might be understood as two different competing strategies - composition and superposition - .. | /2023/superposition-composition/index.html |
10. | Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases. An informal note on intuitions related to mechanistic interpretability. | /2022/mech-interp-essay/index.html |
The HTML pages use the current HTML5 standard. The website does not specify whether its content may be included in search engines, so search engines will index it by default.
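The indexing default described above can be checked with a short script. The following is a minimal sketch in Python, assuming that "specifying details about inclusion" means a `<meta name="robots">` tag in the page source; the names `RobotsMetaParser`, `robots_directives`, and `is_indexable` are hypothetical helpers introduced for illustration. It does not look at `robots.txt` or the `X-Robots-Tag` HTTP header, which can also restrict indexing.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from any <meta name="robots"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for valueless attributes.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list, e.g. "noindex, nofollow".
            for part in attr_map.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.append(part)


def robots_directives(html):
    """Return the list of robots directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """Assume a page is indexable unless it carries 'noindex' or 'none'."""
    directives = robots_directives(html)
    return "noindex" not in directives and "none" not in directives
```

A page with no robots meta tag yields an empty directive list, so `is_indexable` returns `True`, which matches the default behaviour noted for this site.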
IP address: | 18.155.145.42 |
Number of websites: | 4 |
Best-known websites: | 888sport.es (little known) |
Language distribution: | 75% of the websites are English, 25% of the websites are Spanish |
Webserver software: | AmazonS3 |
Load time: | 0.18 seconds (faster than 92% of all websites) |
HTML version: | HTML 5 |
Filesize: | 11.76 KB (597 recognized words in text) |
The website doesn't contain questionable content. It can be used by kids and is safe for work.
Attribute | Classification |
---|---|
Google Safebrowsing | Safe |
Safe for children | Yes |
Safe for work | Yes |
Webwiki rating | No ratings |
Server location | USA |
Trustworthy | 85% |