Hacker News Re-Imagined

Thinking in an array language

  • 141 points
  • 9 days ago

  • @tosh
  • Created a post

Thinking in an array language


@carapace 8 days

Replying to @tosh 🎙

I was just playing with Nils M Holm's Klong this morning: https://t3x.org/klong/index.html (Klong rather than the others mostly because the C implementation looks like C so I have a ghost of a chance of actually grokking it.)

These folks are really onto something, but I think they get sidetracked in the (admittedly very very fun) minutia of the languages and lose sight of the crucial insight in re: mathematical notation, to wit: it's a means of human communication.

For APL or K to get out of their niches would require, I am convinced, something like a tome of documentation of a ratio of about 1.5 paragraphs per line of code. That would give us mere mortals a fighting chance at grokking these tools.

A similar problem plagues the higher-order stuff they're pursuing over in Haskell land. I know e.g. "Functional programming with bananas, lenses, envelopes and barbed wire" and "Compiling to Categories" are really important and useful, but I can't actually use them unless some brave Prometheus scales Olympus and returns with the fire.

Stuff dribbles out eventually. Type inference and checking have finally made it into the mainstream after how many decades?

Reply


@snidane 8 days

Replying to @tosh 🎙

Array notation is great, but only for single array operations and for dense array operations.

The moment you need to run complex group bys and store non-contiguous data efficiently, it gets awkward pretty quick.

On the other hand, operations on dense data is pretty cumbersome in SQL. You can only do so much with limited support of proper scan algorithms, merge joins or bolted on window functions.

Please somebody combine APL with SQL and you win the programming language wars.

Reply


@ogogmad 8 days

Replying to @tosh 🎙

Array languages might make good maths notation. It's terse and easy to write, and there's a logical naming scheme (for instance, matrix multiplication is just (+/*)\: ). I suppose the trick is to think of (+/*)\: as one unit.

Reply


@btheshoe 8 days

Replying to @tosh 🎙

This all seems very similar to writing vectorized code in numpy

Reply


@geophile 8 days

Replying to @tosh 🎙

Sort of related: thinking in relational algebra, or SQL. It appears to be "natural" to think about computing one atomic value at a time, in loops, or slightly less intuitively, recursive functions. (That latter choice may follow from whether your first language was pure Lisp.)

I was fortunate to have a teacher whose database course drilled relational algebra into us. This was in the 70s, shortly after Codd's paper, and well before SQL was invented, much less established. Now I think about much computation algebraically (and often functionally). But I do see that this is "unnatural" for many, including students, having taught databases for several years.

SQL reflects this. I often see students writing nested subqueries, because that is more procedural, where joins would be a cleaner choice. A colleague of mine wrote a paper many years ago, pointing out that thinking procedurally is more "natural" for many: https://dl.acm.org/doi/10.1145/319628.319656. But thinking set-at-a-time instead of one-at-a-time is a valuable skill, not that far off from thinking functionally.

Reply


@IshKebab 8 days

Replying to @tosh 🎙

"What if we could make entire programs as unreadable as regexes?" -K

Reply


@unnouinceput 8 days

Replying to @tosh 🎙

Any performance benchmarks between a simple C implementation of matrix multiplication like explained in the article and K one?

Story time: 2 years ago, right before 1st lockdown, I landed a client. A data scientist who had already implemented an algorithm which dealt with matrices, implemented by a previous programmer he hired. Said implementation was in Python, no more than 10 line altogether, which performed well when matrix size where small, like 10x10. But the problem was, his real work need it to have matrices of size 10^6 x 10^6. Not only the Python implementation had to be ran on a beast of server with 4TB of memory, it also took 3 days to finish. And while the algorithm was small in size in Python, its explaining paper was 4 pages in total, which took me 1 week to understand. And then an entire month to implement in C. But in the end, when all was said and done, the run time when used with real data was only 20 minutes and consumed only 8 GB of memory, though it did required at least 16 virtual processors.

Hence my question, in the end performance is what it matters, not number of lines.

Reply


About Us

site design / logo © 2022 Box Piper