thoughts.txt [慧慧]

i blunder words and overthink


words don't come easy to me -- especially in the LLM era

when talking to LLMs, i no longer bother to write in full sentences, nor do i even attempt to express myself articulately. i don't even fix my typos. even before i get stuck, it spoonfeeds me suggestions for the next possible word. more and more, reading non-LLM articles feels exhausting and my attention careens. words don't come easy

but i can tell. i can't tell if an article is necessarily LLM-generated, but i can tell when it's not.

for decades, literacy rates have been part of how we measure a country's level of development. in the LLM era, why do words still matter?

  • "Humans achieve agency by composing their lives in their own language.": LLMs produce words, but not meaning. meaning comes when the words are your own, able to produce new experiences and ideas and "activates fresh existence" when spoken to someone else, to yourself, to some god. words form narrative and empowerment. to expand your vocab with new imagery, metaphors, is to allow you to explicate how you feel, what you think and believe more precisely. you build relationships with your words and they form a personal understanding with you. it's like when we used to learn a new SAT word and would stuff it into sentences. inorganic at first, but eventually you'd gain more nuance on how the word fits into your expression.
  • words as historical artifacts: words aren't just words, they are a history. as someone who grew up in hong kong speaking english as my first language, i recognise my relationship with language stems from a colonial legacy, the british empire. with LLMs, history is conflated into a monolithic archive of training data. as Alain Mabanckou asks: "AI may well be able to take on board these cultural elements, but can it also reproduce the suffering of these oppressed peoples?"
  • beauty, resonance, aesthetic: i write poetry. a part of poetry is surprise, to draw an unfamiliar connection between two things (within the constraint of what 'makes sense', however that's defined, but even this can be challenged e.g. by surrealists) -- the churning washing machine and the cycle of poverty, the moon and a belly button. LLMs predict predictable predictions. the next most probable token. words are not just tools, but things that draw beauty.

memory

we've been trying to solve memory in machines since the symbolic era -- SOAR and its production-rule memory -- then RNNs and LSTMs with gates to control information flow, then the legendary "Attention is all you need" paper in 2017, which made the context window the de facto memory (-> context rot, lost in the middle), and now external memory banks (RAG, graphs, scratch pads). catastrophic forgetting continues. labs rush to solve it. (a tiny sketch of an external memory bank follows the list below.)

  • memory is not a bag of facts, it's layered control: it's not just about storing information, it's about being selective about what to remember and what to forget. pruning and forgetting are equally important when it comes to memory
  • memory != continual learning: learning isn't just remembering and is never static ofc. LLMs are static post-pretraining and limited to in-context learning, lacking the ability for adaptation and "neuroplasticity"
  • formation of long-term memory involves two consolidation processes: online (synaptic) consolidation soon after learning, where new info is stabilised and transferred from short-term to long-term storage, then offline (systems) consolidation, replaying recently encoded patterns via sharp-wave ripples in the hippocampus
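
the external-memory idea is easier to see in code than in prose. a minimal sketch, stdlib only, assuming a toy bag-of-words "embedding" as a stand-in for a real embedding model; the names here (MemoryBank, embed, prune) are mine, not any lab's API. the point is that writing, retrieving, and deliberately forgetting are all part of the same loop.

# toy external memory bank: write, retrieve, and deliberately forget.
# embed() is a stand-in for a real embedding model; everything here is illustrative.
import math
import time
from collections import Counter

def embed(text: str) -> Counter:
    # bag-of-words "embedding" -- swap in a real model in practice
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryBank:
    """store memories, retrieve by similarity, prune the least-used first."""

    def __init__(self, capacity: int = 100):
        self.capacity = capacity
        self.items = []  # each: {"text", "vec", "added", "hits"}

    def write(self, text: str) -> None:
        self.items.append({"text": text, "vec": embed(text),
                           "added": time.time(), "hits": 0})
        if len(self.items) > self.capacity:
            self.prune()

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        qv = embed(query)
        ranked = sorted(self.items, key=lambda m: cosine(qv, m["vec"]), reverse=True)
        for m in ranked[:k]:
            m["hits"] += 1  # retrieval strengthens a memory -- a crude consolidation signal
        return [m["text"] for m in ranked[:k]]

    def prune(self) -> None:
        # forgetting is part of the design: drop the least-retrieved, oldest items first
        self.items.sort(key=lambda m: (m["hits"], m["added"]))
        self.items = self.items[-self.capacity:]

bank = MemoryBank(capacity=50)
bank.write("2017: attention is all you need, context window as working memory")
bank.write("sharp-wave ripples replay recently encoded patterns in the hippocampus")
print(bank.retrieve("how does the hippocampus replay memories?", k=1))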

sources & more

  1. SOAR: arxiv.org
  2. LSTM: ieeexplore.ieee.org
  3. attention is all you need: arxiv.org
  4. memory from then to now: huggingface.co
  5. nested learning: research.google

what should last?

i enjoyed this article whilst having my morning coffee:

lil.law.harvard.edu
  • 'Fragility, and the culture it creates, can be an asset in inspiring the sort of care necessary for the long term. A system that seeks the indestructible or infallible has the potential to encourage overconfidence, nonchalance, and the ultimate enemy of all archives, neglect.'
  • 'The success of century-scale storage comes down to the same thing that storage and preservation of any duration does: maintenance. The everyday work of a human being caring for something. [...] How it is stored will evolve or change as it is maintained, but if there are maintainers, it will persist.'

as someone who likes to document everything but rarely goes back to read it, i often question what the value of preservation is. search requires you to remember what to search for. scrolling requires time. all the bits of myself floating in the digital void. i'd be curious to fine-tune an LLM on all my journal entries, notes, images, thoughts, to see whether it can evolve into a mini-me.
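
if i ever try the mini-me experiment, step one is just wrangling the archive into training pairs. a rough sketch, stdlib only; the folder name, the prompt template, and the jsonl shape are all assumptions -- most fine-tuning tooling wants something like this, but check whatever trainer you end up using. the actual fine-tune (e.g. LoRA on a small open model) comes after, and whether the result reads like me is the real question.

# step one of the "mini-me" idea: turn a folder of journal entries into
# prompt/completion pairs. paths, template, and output shape are all assumptions.
import json
from pathlib import Path

JOURNAL_DIR = Path("journal")   # hypothetical: one .txt file per entry
OUT_FILE = Path("mini_me.jsonl")

def chunk(text: str, max_chars: int = 2000) -> list[str]:
    # naive split on blank lines, packing paragraphs into ~max_chars chunks
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks

with OUT_FILE.open("w", encoding="utf-8") as out:
    for entry in sorted(JOURNAL_DIR.glob("*.txt")):
        date = entry.stem  # assumes filenames like 2024-03-01.txt
        for piece in chunk(entry.read_text(encoding="utf-8")):
            example = {
                "prompt": f"journal entry, {date}. write in my voice:",
                "completion": piece,
            }
            out.write(json.dumps(example, ensure_ascii=False) + "\n")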

china-us decoupling, on the ground

i first heard the phrase "china-us decoupling" as a junior at berkeley [1]. that was 6 years ago. since then, the phrase has been tossed around by media outlets and gnawed on by economists and friends [2, 3 for a few examples]. i never thought too much about it, but recently, moving between hk/shenzhen and the US, i feel like each place asks me to adopt a new set of lingo, learn a separate UI flow, download a different set of apps:

  • seeing double on my homescreen: consumer apps like google maps vs gaode (alibaba), whatsapp/messenger vs wechat, foundational AI (deepseek, kimi, doubao vs openai, anthropic), electric cars and home products (biyadi, xiaomi, xiaopeng, weilai, lixiang vs tesla/lucid). last year i came home and the roads were packed with teslas; now i see more biyadis. used to google maps, i found the gaode flow confusing and awkward. i learnt to buy meituan coupons for restaurant discounts before checking out, and to top up my weixin/alipay wallet.
  • consumer products: i went to decathlon in shenzhen to buy protein bars and couldn't find PhD, Barebells, or any international brand, just local Chinese ones.
  • twin markets finding parity: talking to my dad this morning, i heard alibaba's origin story: starting out as a registrar replacing the yellow pages, then growing into an ecommerce marketplace, much like amazon grew out of an online bookstore. on the flip side, i saw on x that livestream sales in clothing stores are beginning to grow in the US, a trend that started in china. two separate markets learning from each other.
  • data decoupling, the beginnings: alibaba is growing its cloud business, competing against AWS. many multinationals in china still use AWS, but with the decoupling of foundation models and their training data, will we go further and decouple even what we store in the cloud?

sources & more

  1. The Great Decoupling and Sino-US Race for Technological Supremacy: youtube.com
  2. tyler cowen: marginalrevolution.com
  3. reuters.com
  4. aspistrategist.org.au

robots in the wild

+1 follow-up

massive traffic, and my dad chuckles to himself and says "must be a self-driving car glitching out up front". he turns to my mum: "with the way you drive, maybe it's trained on your data".

thinking about the "long tail" of edge cases that self driving cars need to be trained on and the success it can have in china:

  • crazy drivers, unlearnable behaviour? with the way folks drive in shenzhen, plus traffic rules more often ignored than followed, it feels too "unpredictable" for an automated system to navigate
  • protecting workers: even if the tech succeeds, the government may not be keen on mass deployment, since it would displace driving jobs in a market with already high unemployment. [3, 5, 6]
  • chinese autonomous vehicle market: baidu's apollo go, weride, and pony ai are all in the testing phase, but on international roads: singapore, abu dhabi, dubai. why not chinese roads? too difficult to learn, or additional regulations? [2]

other robot things:

  • humanoid vs non-humanoid: the case for humanoids is that most of the built world is shaped around the human form. but it's inefficient. why build a humanoid robot to drive a car when you can just build a self-driving car?
  • imo china will just dominate. for anything that requires hardware and materials, china has such a strong competitive advantage in its manufacturing base.

sources & more

  1. learning through putting robots into the wild: x.com
  2. theguardian.com
  3. 'Waymo has quickly captured more than 10% of the SF ride-sharing market': x.com
  4. x.com
  5. youtube.com
  6. '2024, the number of licensed ride-hailing drivers nationwide had reached 7.48 million — a figure roughly equal to the entire population of Hong Kong': beijingscroll.com
  7. '77% of drivers entered the ride-hailing sector after being unemployed': hr.asia

the ai application layer bloat. winners?

foundational models are slowing down; the ai product space is beginning to grow. there's a lot of noise, a lot of "chatgpt" wrappers, 20 YC companies doing a similar thing: medical advice ai, lovable for x, palantir for y, etc. yet none has stood out. it's a hyper-competitive space. i see so many twitter posts announcing new company launches, then never hear about them again. what will stick?

  • right now: they lack trust loops and don't solve real pain points. sure, there's novelty, but it feels like people try them once and then leave
  • ai as a feature, not a destination: companies that already embed customers in their workflows, e.g. notion, figma, salesforce, will win by absorbing ai into existing user behaviour rather than asking users to switch tools. a standalone ai product has to be either 1) revolutionary for a current workflow, or 2) solving something currently unsolved / manual in a niche, narrow market
  • unique data as competitive advantage: e.g. an ai coach that knows your habits, your preferences, your progression, and can tailor education to you
  • high-trust verticals: domains that demand reliability, accuracy, emotional intelligence, the human touch -- e.g. law and healthcare -- may adopt more slowly
  • how much do people actually use AI? am i just in a bubble??

sources & more

  1. x.com
  2. ycombinator.com
  3. like just being on twitter in general, and if you look at any VC portfolio
  4. youtube.com

genai homogenising thought and power, internet as abundance and connection

the internet allowed for massive decentralisation of voices, opinions, perspectives, but also infinite connection through network effects. you can find blogs with 1 view, videos from the 2000s of a skater kid, tumblr journal entries, recipes half written... fragmented by design, where the long tail thrived. genAI is now doing the opposite. no one goes to google to search; they go to chatgpt with their query, and chatgpt takes the "top 10 relevant results", summarises and blends the individual authors' voices into the murky brown of mixed paint, and hands you the result. it abstracts and regurgitates the mess. likely you won't fact-check it. likely you won't click into the sources. what does this mean for us? what does this mean for the internet? the internet becomes less a place for us to connect, and more a place for the voices of the "influential", the ones with better SEO and higher views, to take control. i mean, in a way the internet is a reflection of society.

  • openai and anthropic are beginning to build these product platforms: book hotels directly in chat, pull together a list of shopping options, plan your tour. pure productivity optimisation, convenience brainrot, let us do this for you, offload your cognition, a full personal assistant, or just a person -- you?
  • Kierkegaard warns: "eventually human speech will become just like the public: pure abstraction—there will no longer be someone who speaks, but an objective reflection will gradually deposit a kind of atmosphere."
  • smooth communication: the texture of human communication gets lost, the typos, tangents, style, what tech folks like to call "taste"
  • return of the retro: unoptimised spaces, in-person interactions, grassroots community building. think the old myspace, or facebook in the 2010s when we posted on one another's walls. strava-style connection is what we're looking for: small circles, close friends, un-curated, authentic

sources & more

  1. not directly, but made me think of byung-chul han and his thoughts on digital society, a good summary here: rhizomes.net
  2. been a while since i read trisha low's socialist realism, but the way she describes the internet then -- that era of myspace, neon colours and raw html
  3. additional thought: at one point the internet was all open-access, until paywalls, auth, password protection etc. were introduced as we publicised our selves.

hello world

you know, some days i realise my brain is truly empty and there is nothing there, because i don't give it the space to breathe. i execute my 9-6, exercise, eat, sleep, and recycle. i'm scared to be bored, and because of this i fill free time with noise -- another youtube documentary, podcast, the occasional insta reels void. this is my accountability tracker. no fancy designs, just raw text.

hello world, i live.