Since 3/25/2018, 2:16:13 PM, @paubric has earned 61 Karma Points across 23 contributions.
Recent @paubric Activity
Translation Is Pervasive
1 point • 0 comments
Diffusion magnetizes manifolds (DALL-E 2 intuition building)
1 point • 0 comments
context: interned at UiPath on a team doing ML
A lot of unnecessary hype indeed, though you can actually throw in occasional OCR/NER blocks in the workflow you design. Most often it's common NLP tasks on documents being moved around, but also forecasting, etc.
Thanks for the interest! What I've been playing with in the project is not really a summarizer, though I think it could work well with a summarization model. This one is simply about highlighting relevant bits of a certain text (be it the original or a summary), so that you can then "scan" it in interesting ways.
I totally agree that people's summaries of the same original article are different, and moreover that the best summary for you personally is different from mine, and different from anyone else's. It should not tell you what you already know, for instance. But again, that's a whole different can of worms...
What it says on the tin: BERT attention per token gets converted to CSS styling to grab the user's attention. The idea came from thinking about content consumption as perceiving large bodies of information, rather than searching for a specific document. In this framing, attention becomes a building block of a "perceptual engine," which itself has a few nice-to-haves: abstraction, top-down influences, interactivity. More details in the write-up/demo; I'll stick around to answer questions.
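For a concrete sense of how per-token attention could be turned into styling, here is a minimal sketch (not Cybersalience's actual code) using HuggingFace transformers: average the attention each token receives in BERT and map it to an inline CSS opacity.

```python
# Minimal sketch: per-token BERT attention -> inline CSS opacity.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

text = "Attention can guide the reader toward the most salient spans."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one (batch, heads, seq, seq) tensor per layer.
# Average over layers and heads, then sum the attention each token *receives*.
attn = torch.stack(outputs.attentions).mean(dim=(0, 2))[0]  # (seq, seq)
received = attn.sum(dim=0)                                  # (seq,)
weights = (received - received.min()) / (received.max() - received.min())

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
html = "".join(
    f'<span style="opacity:{0.3 + 0.7 * w.item():.2f}">{tok} </span>'
    for tok, w in zip(tokens, weights)
    if tok not in ("[CLS]", "[SEP]")
)
print(html)
```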
Show HN: Cybersalience – Guiding user attention using transformer attention
3 points • 3 comments
Hi there! The first few paragraphs say it all: I was curious about how we might train a virtual assistant to actually help the user reach conclusions, rather than simply hand them the right answer. While the outcome is far from perfect, it was pretty fun working with reinforcement learning for the first time. Will hang around to answer any questions.
Show HN: Oneironomicon – Training collaborative AI on dreamed-up users
2 points • 1 comment
I had similar doubts initially. I went for the name they used in the paper: https://arxiv.org/pdf/2102.05169.pdf. It takes the sentence out of its original context. A "contextualizer" might instead take a transcluded paragraph included as-is in different notes and integrate it more coherently into each context.
Thanks for the thoughtful reply! If I understood correctly, you argue that the lexiscore might still lead you down a rabbit hole, even if through ever more challenging ideas, rather than helping you find nutritious content outside that echo chamber.
If you're really "skilled at" or fluent in the flat earth perspective, then content which presents a wildly different perspective on the same topic would get labeled as highly nutritious (or at least that was the intention). I understand your concern, and see how "advanced" content from the same ideological background might make it there, though I think there must be a point where you can't get any more advanced at it, after which related content falls into the boring sector, leaving the conflicting takes in the sweet spot (in theory).
It's also worth highlighting something mentioned towards the end of the discussion: actively moving across the diagonal channel (i.e. consuming both challenging content on familiar topics and accessible content on unfamiliar ones) might be a further improvement, though it's not explicitly implemented in this initial version.
I'm really glad you like it! It should soon get its biggest update yet, related to sharing features / learning in public. More details at the end of this last article: https://paulbricman.com/reflections/thinking-in-public
tldr: You add content in a self-hosted web app (e.g. as RSS, PDF, EPUB...), and it measures how interesting it is by trying to reconstruct it with a GPT-3-like model and seeing how predictable it is.
Indeed, the page is more of a write-up than a landing page. Also, broken link fixed, thanks!
It could work if the extension called on a server to do the heavy lifting of reconstructing the text with the language model. I have a feeling that trying to do it all in the browser today would essentially mean implementing non-trivial parts of HuggingFace's transformers in JavaScript + tensorflow.js, or trying to compile PyTorch to WASM or something. Not the most enjoyable of tasks, eh...
Thanks for the kind words! I totally agree, I feel that in a few years we'll look back at how we currently find online content and wonder how we could possibly have been okay with it.
Hi all, I started experimenting with this new approach to finding valuable content online after I got frustrated with traditional recommendation systems leading to echo chambers and all sorts of other nasty failure modes listed in the write-up. Unlike traditional recommendation systems, the lexiscore uses a language model which has access to your written notes. It estimates how surprising different content items are for you based on the model's perplexity in reconstructing the texts. This way, you can quickly tell which articles are likely to be too boring or too challenging and narrow in on the sweet spot of balanced skill and challenge.
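To make the perplexity step concrete, here is a minimal sketch with GPT-2 standing in for the GPT-3-like model, and without the personalization on your own notes that the lexiscore layers on top: lower perplexity means the model finds the text predictable (likely boring), while very high perplexity suggests it is too challenging.

```python
# Minimal sketch: score a text by language-model perplexity.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        # Passing labels makes the model return the average cross-entropy loss.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

for article in [
    "The sun rises in the east and sets in the west.",
    "Sheaf cohomology measures the obstruction to gluing local sections.",
]:
    print(f"{perplexity(article):8.1f}  {article}")
```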
Show HN: Lexiscore – a nutritional label for food for thought
66 points • 12 comments
That would be a neat little study for one of those people with a casual Obsidian vault of 80K notes:
1. Randomly pick 2 notes
2. Compute semantic distance and minimum traversal distance through the graph (LinkedIn 3rd-degree-connection style)
3. Get a scatter plot and compute the correlation
If you're one of these knowledge power users, ping me by email if you want to collaborate on this! (A rough sketch of the pipeline follows below.)
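A rough sketch of steps 2-3, assuming the vault has already been loaded into a networkx graph with one text per note, and using sentence-transformers as a stand-in embedding model:

```python
# Sketch: does semantic distance between two notes correlate with their
# minimum traversal distance through the link graph?
import random
import networkx as nx
import numpy as np
from scipy.stats import pearsonr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def run_study(graph: nx.Graph, texts: dict, pairs: int = 1000):
    embeddings = {n: model.encode(texts[n]) for n in graph.nodes}
    semantic, traversal = [], []
    nodes = list(graph.nodes)
    for _ in range(pairs):
        a, b = random.sample(nodes, 2)
        if not nx.has_path(graph, a, b):
            continue
        ea, eb = embeddings[a], embeddings[b]
        # 1 - cosine similarity as the semantic distance
        semantic.append(1 - np.dot(ea, eb) / (np.linalg.norm(ea) * np.linalg.norm(eb)))
        # Minimum traversal distance (LinkedIn 3rd-degree-connection style)
        traversal.append(nx.shortest_path_length(graph, a, b))
    # Plotting semantic vs. traversal gives the scatter plot; pearsonr the correlation.
    return semantic, traversal, pearsonr(semantic, traversal)
```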
The conceptarium (currently) uses OpenAI's CLIP model, which encodes both texts and images into a shared semantic space so that related items are close to each other. If you have a picture of a bike pump and a written note about bike pumps, they'll get similar hashes. That means you can search for written notes on bike pumps by taking a picture of a bike pump (Google Lens style), search for pictures using text queries, or mix the two.
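For a sense of how that cross-modal search could work, here is a small sketch (not the conceptarium's actual code) using the public CLIP checkpoint via HuggingFace transformers; the photo filename is just a placeholder:

```python
# Sketch: score a query photo against written notes in CLIP's shared space.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

notes = ["a note about bike pumps", "a recipe for pancakes"]
photo = Image.open("bike_pump_photo.jpg")  # placeholder query image

inputs = processor(text=notes, images=photo, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)

# Cosine similarity between the photo and each written note
scores = (image_emb @ text_emb.T).squeeze(0)
for note, score in zip(notes, scores.tolist()):
    print(f"{score:.3f}  {note}")
```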
And oh man, a dream of mine as an expat in the Netherlands is to learn how to take my bike apart and put it back together, to better understand how it works and how to fix it myself. Though this idea never made it very high on my priority list.
Thanks for the thoughtful bit of commentary! It's true, my disproportionate focus on embeddings and no explicit links might have been an extreme reaction to the current state of tools for thought, which are 99% graph-based. I was curious to see what would happen if you relied only on embeddings. Though I agree that explicit links are sometimes handy and can encode useful information, so I'm also excited to see more of this interplay developing in the future.
Unconference on Building Tools for Thought
3 points • 0 comments