@bn

OK, so this evening I trained a Sparse Autoencoder (SAE) on nanochat-d20 (using TopK). It found a bunch of cool features, but my favorite is the research feature:

#27 Feature 3931 (max=0.40)
[0.40] has awarded up to €1 billion in[** research**] funding for
[0.38] Paper Simple Ways To Write A Literary[** Research**] Paper
[0.35] dupunks' Guide: How to Do[** Research**] Online
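
For anyone who hasn't seen a TopK SAE before, here is a rough sketch of the forward pass in plain Python. It is not the code from the repo: the names (`topk_sae_forward`, `W_enc`, `W_dec`, `b_enc`, `b_dec`), the toy sizes, and the random weights are all assumptions for illustration, and a real version would use PyTorch tensors on the model's activations.

```python
import random

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """Encode x, keep only the k largest feature scores, decode back."""
    # One pre-activation score per dictionary feature.
    pre = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(W_enc, b_enc)]
    # TopK: indices of the k largest scores; everything else stays at zero.
    keep = sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k]
    acts = [0.0] * len(pre)
    for i in keep:
        acts[i] = max(pre[i], 0.0)  # clamp negatives (an assumed convention)
    # Reconstruction: decoder bias plus the weighted decoder rows.
    recon = list(b_dec)
    for i in keep:
        for j in range(len(recon)):
            recon[j] += acts[i] * W_dec[i][j]
    return acts, recon

# Toy example with hypothetical sizes and random weights.
random.seed(0)
d_model, n_features, k = 8, 32, 4
W_enc = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_features)]
W_dec = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_features)]
b_enc = [0.0] * n_features
b_dec = [0.0] * d_model
x = [random.gauss(0, 1) for _ in range(d_model)]

acts, recon = topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k)
active = [i for i, a in enumerate(acts) if a != 0.0]
print("active features:", active)
```

Listings like the one above (a feature number plus its top-activating snippets) typically come from running many texts through the encoder and sorting tokens by a single entry of `acts`.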

If you prompt the bot with something like:

"You are a cupcake"

Then setting the strength of that feature to -0.5 steers the model and you get a super cosy response:

Welcome to my humble cupcake, but there's a bit more festive flair to my show this time of year. When you're filled with freshly baked apple cider, I'll be there with you to make sure your warm, cozy home's always top of the bed, with a warm and welcoming holiday!

And if you crank the strength up to 0.5, you warp into the science response:

  1. Collect and question
  2. Form a hypothesis
  3. Design experiments
  4. Implement
  5. Testing results
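
One common way to implement this kind of steering is to add a scaled copy of the feature's decoder direction to the model's hidden activations during generation. The sketch below assumes that approach; the post doesn't say exactly how the strength is applied, and `steer`, `resid`, and `direction` are hypothetical names with toy values.

```python
def steer(resid, direction, strength):
    """Return resid shifted by strength * direction (assumed steering rule)."""
    return [r + strength * d for r, d in zip(resid, direction)]

# Toy residual vector and a toy "research" feature direction.
resid = [0.25, 1.0, -0.5]
direction = [1.0, 0.0, 0.0]

cosy = steer(resid, direction, -0.5)    # push away from the feature
science = steer(resid, direction, 0.5)  # push toward the feature
print(cosy, science)
```

In a real model the same addition would be applied at every token position of the layer the SAE was trained on, for example through a forward hook.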

GitHub repo

