ongoing by Tim Bray

Lounge Penguin 23 Jun 2024, 7:00 pm

Lounge, as in a jazz club. Penguin, as in GoGo Penguin, a piano/bass/drums trio. We caught their show at Jazz Alley in Seattle last week. Maybe you should go hit a jazz lounge sometime.

What happened was

My daughter turned eighteen and graduated high school. She had heard that Car Seat Headrest was playing Seattle’s Woodland Park Zoo, and could tickets and a road trip (me buying and driving) be her present? Seemed reasonable, and she found a friend to take along. I wouldn’t mind seeing the Headrests (decent indie rock stuff) but her party, her friend. I noticed that GoGo Penguin was playing Seattle’s Jazz Alley, and Lauren was agreeable to coming along for the ride and the show.

I only know about GoGo Penguin because YouTube Music drops them into my default stream now and then. I’d thought “sounds good, maybe a little abstract”, couldn’t have named a song, but hey.

The “Jazz Club” concept

You’ve seen it in a million old movies, and the Vic Fontaine episodes of ST:DS9. The lights are low, the audience is sitting at tables with little lamps on them, the band’s on a thrust stage among the tables, there’s expected to be a soft background of clinking glasses and conversation. Some people are focusing in tight on the music, others are socializing at a respectfully low volume.

Of course, usually a gunfight breaks out or an alien materializes on stage… no wait, that’s just on-screen not real-life.

All jazz clubs serve alcohol — fancy cocktails, natch — and many will sell you dinner too. Dimitriou’s Jazz Alley in Seattle is a fine example.

GoGo Penguin at Jazz Alley; June 20th, 2024.
Our table was in the balcony.

We had a decent if conventional Pacific-Northwest dinner (crab and halibut), with a good bottle of local white. They’ve got things set up so most people have finished eating by the time the music starts. The seats were comfy. The decor was pleasing. The service was impeccable. I felt very grown-up.

GoGo Penguin

They’re three youngish guys from Manchester. Their Web site says they’re an “emotive, cinematic break-beat trio”. OK then. Piano/bass/drums is the canonical minimal jazz ensemble. Only they’re not minimal and it’s not jazz. I guess if you redefined “jazz” as complex, rhythmically sophisticated music featuring virtuoso soloing skills, well yeah. Damn, those guys can play. But their music is heavily composed, with not a lot of opportunities for anyone to stretch out and ride the groove.

And it ain’t got that swing; can it still mean a thing?

I guess so, because I enjoyed myself. There wasn’t a microsecond that was boring, plus the arrangements were super intelligent and kept surprising me.

But most of all, the bass. Nick Blacka hit me harder than any bassist since I saw (and blogged!) Robbie Shakespeare of Sly and Robbie in 2004.

It’s really something special. It may be a stand-up acoustic bass, but it’s wired up so he can dominate the band’s sound when he reaches back for it (which he does neither too little nor too much). Plus the instrument’s acoustic texture roars out entirely unmarred, you can feel those strings and wood in your gut. He moves between bowing and plucking and banging and you hardly even notice because it’s always the right thing.

I don’t wanna diss Chris Illingworth on piano or Jon Scott on drums; both of them made me catch my breath. But it’s Blacka’s bass explosions that I took home with me.

That swing?

These days my musical obsessions are Americana (i.e. bluegrass with pretensions) and old blues. The first of which also features instrumental complexity and virtuosity. And, if I’m being honest, both offer a whole lot more soul than the Penguins.

I respect what they’re doing. I’ll go see them again. But I wish they’d get the hell out from behind those diamond-bright razor-sharp arrangements and just get down sometimes.

Next?

Lauren and I had real fun and left feeling a bit guilty that we’ve been ignoring Vancouver’s own jazz clubs. Not that I’m going to stop going to metal or post-punk or baroque concerts. But jazz clubs are a good grown-up option.

Epsilon Love 17 Jun 2024, 7:00 pm

Quamina was for a time my favorite among all my software contributions. But then it stalled after I shipped 1.0 in January of 2023. First of all, I got busy with the expert witness for Uncle Sam gig, and second, there was a horrible problem in there that I couldn’t fix. Except now I have! And I haven’t done much codeblogging recently. So, here are notes on nondeterministic finite automata, epsilon transitions, Ken Thompson, Golang generics, and prettyprinting. If some subset of those things interests you, you’ll probably like this.

(Warning: if you’ve already had your hands on the theory and practice of finite automata, this may all be old hat.)

[Update: This is kind of embarrassing. It looks like what this post refers to as an “epsilon” is not the same epsilon that features in the theory of finite automata. I mean, it still works well for where I’m using it, but I obviously need to dig in harder and deeper.]

Sidebar: What’s a Quamina?

I don’t think there’s much to be gained by duplicating Quamina’s README but in brief: “A fast pattern-matching library in Go with a large and growing pattern vocabulary and no dependencies outside Go’s standard libraries.” If you want much, much more, this Quamina Diary blog series has it.

The problem

Combining too many patterns with wild-cards in them caused Quamina 1.0’s data structures to explode in size with a growth rate not far off the terrifying O(2ᴺ), which meant that once you’d added much more than 20 patterns you couldn’t add any more, because the add-pattern code’s runtime was O(2ᴺ) too.

Those structures are state machines generally, “nondeterministic finite automata” (NFAs) in particular. Which offer good solutions to many software problems, but when they get to be any size at all, are really hard to fit into a human mind. So when I was looking at Quamina’s unreasonably-big automata and trying to figure out how they got that way, my brain was screaming “Stop the pain!”

Lesson: Prettyprint!

At the point I stalled on Quamina, I’d started a refactor based on the theory that the NFAs were huge because of a failure to deduplicate state transitions. But the code I’d written based on that theory was utterly broken; it failed simple unit tests and I couldn’t see why.

During the months when I was ignoring the problem, I privately despaired because I wasn’t sure I could ever crack it, and I couldn’t stomach more struggling with ad-hoc Printf and debugger output. So I decided to generate human-readable renditions of my automata. Given that, if I still couldn’t figure out what was going on, I’d have to admit I wasn’t smart enough for this shit and walk away from the problem.

Which turned out to be a good call. Generating an information-dense but readable display was hard, and I decided to be ruthless about getting the spaces and punctuation in the right places. Because I didn’t want to walk away.

Back in the day, we used to call this “prettyprinting”.

It worked! First of all, my prettyprinter showed me that the automata emitted based on my deduplication theory were just wrong, and what was wrong about them, and I found that code and fixed it.

Bad news: My deduplication theory was also just wrong. Good news: My prettyprinter provided unavoidable proof of the wrongness and made me go back to first principles.

And I just landed a PR that cleanly removed the state explosion.

Free advice

I’ll show off the prettyprinter output below where I dig into the state-explosion fix. But for the moment, a recommendation: If you have a data structure that’s not Working As Intended and is hard to grok, go hide for a couple of days and write yourself a prettyprinter. Prettyprinting is an intelligence amplifier. Your Future Self will thank you heartily.
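
To make that advice concrete, here’s the shape such a thing might take, as a minimal Go sketch. The types are a toy — a deterministic two-field machine, nothing like Quamina’s internals — but the disciplines are the ones that matter: one state per line, stable labels, transitions printed in a deterministic order.

    package main

    import (
        "fmt"
        "sort"
        "strings"
    )

    // A toy deterministic machine, just for illustration.
    type state struct {
        label string
        next  map[byte]*state
    }

    // prettyPrint emits one line per reachable state, with transitions
    // sorted by byte so the output is identical from run to run.
    func prettyPrint(start *state) string {
        var sb strings.Builder
        seen := map[*state]bool{}
        var walk func(*state)
        walk = func(s *state) {
            if seen[s] {
                return
            }
            seen[s] = true
            keys := make([]int, 0, len(s.next))
            for b := range s.next {
                keys = append(keys, int(b))
            }
            sort.Ints(keys)
            fmt.Fprintf(&sb, "%s:", s.label)
            for _, k := range keys {
                fmt.Fprintf(&sb, " '%c' → %s", byte(k), s.next[byte(k)].label)
            }
            sb.WriteByte('\n')
            for _, k := range keys {
                walk(s.next[byte(k)])
            }
        }
        walk(start)
        return sb.String()
    }

    func main() {
        match := &state{label: "match"}
        start := &state{label: "start", next: map[byte]*state{'x': match}}
        fmt.Print(prettyPrint(start))
    }

Run against that two-state machine it prints something like “start: 'x' → match”; the payoff comes when the machine has fifty states and you can still scan it.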

“Back to first principles”?

The single best write-up on NFA and regex basics that I’ve ever encountered is Regular Expression Matching Can Be Simple And Fast (but is slow in Java, Perl, PHP, Python, Ruby, ...) by Russ Cox. It’s a discussion of, and reflection on, the regular expression library constructed by Ken Thompson in the mid-Sixties, before he got mixed up in Unix.

What’s annoying is that I had read this before I started wiring NFAs into Quamina, but ignored most of its important lessons due to a combination of not understanding them and thinking that my existing code could do what Cox described. A couple of weeks ago I went back and read it again, and it all made perfect sense and showed me the way forward. So I guess the lesson is that if you’re not Ken Thompson, you’re going to have trouble understanding what he did until you’ve tried and failed yourself?

So, major thanks to Ken for this (and Unix and other things too) and to Russ for the write-up.

Epsilon transitions

Epsilon transitions are the magic bullet that makes NFAs work. Quamina didn’t have them; now it does. There are other bits and pieces, but that’s the core of the thing.

I think the easiest way to explain is by showing you an NFA as displayed by Quamina’s new prettyprinter. It matches the regular expression "x.*9" — note that the " delimiters are part of the pattern:

 758[START HERE] '"' → 910[on "]
 910[on "] 'x' → 821[gS]
 821[gS] ε → 821[gS] / '9' → 551[gX on 9]
 551[gX on 9] '"' → 937[on "]
 937[on "] 'ℵ' → 820[last step]
 820[last step]  [1 transition(s)]
  • There’s an API to attach labels to states as you build automata, which as a side-effect gives each a random 3-digit number too. This is done in a way that can be turned into a no-op at production time.

  • 758: The start state; the only character that does anything is the opening " delimiter which transitions to state 910.

  • 910: You get here when you see the " and the only exit is if you see an x, which moves to 821.

  • 821: This state is the “glob” * operator. gS in its label stands for “glob spin”. It has an “epsilon” (ε) transition to itself. In Computer-Science theory, they claim that the epsilon transition can occur at any time, spontaneously, la-di-da. In programming practice, you take an epsilon transition for every input character. 821 also has an ordinary transition on 9 to state 551. (There’s a code sketch of this structure just after this list.)

    This possibility of having multiple transitions out of a state on the same input symbol, and the existence of epsilon transitions, are the defining characteristics that make NFAs “nondeterministic”.

  • 551: Its label includes gX for “glob exit”. The only transition is on the closing " delimiter, to 937.

  • 937 has only one transition, on ℵ (which stands for the reserved value Quamina inserts to signal the end of input), to 820.

  • 820 doesn’t do anything, but the [1 transition(s)] label means that if you reach here you’ve matched this field’s value and can transition to working on the next field.
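
Here’s a minimal sketch in Go of how such states might be represented and how the automaton above might be wired up. These types, and the aleph constant standing in for Quamina’s reserved end-of-input value, are assumptions for illustration, not Quamina’s actual internals.

    // Hypothetical representation, not Quamina's actual types: a byte can
    // lead to several states, and epsilon transitions live in their own
    // list because they're followed on every input byte.
    type nfaState struct {
        label   string
        epsilon []*nfaState          // followed on every input byte
        next    map[byte][]*nfaState // ordinary byte transitions
    }

    // aleph stands in for the reserved end-of-input value; the actual
    // byte Quamina uses is an assumption here. 0xF5 never occurs in UTF-8.
    const aleph byte = 0xF5

    // buildXDotStar9 wires up the "x.*9" automaton shown above.
    func buildXDotStar9() *nfaState {
        lastStep := &nfaState{label: "last step"}
        onQuote2 := &nfaState{label: "on \"", next: map[byte][]*nfaState{aleph: {lastStep}}}
        globExit := &nfaState{label: "gX on 9", next: map[byte][]*nfaState{'"': {onQuote2}}}
        globSpin := &nfaState{label: "gS", next: map[byte][]*nfaState{'9': {globExit}}}
        globSpin.epsilon = []*nfaState{globSpin} // the ε self-loop that implements the glob
        onQuote1 := &nfaState{label: "on \"", next: map[byte][]*nfaState{'x': {globSpin}}}
        return &nfaState{label: "START HERE", next: map[byte][]*nfaState{'"': {onQuote1}}}
    }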

Now I’m going to display the prettyprint again so you can look at it as you read the next paragraph.

 758[START HERE] '"' → 910[on "]
 910[on "] 'x' → 821[gS]
 821[gS] ε → 821[gS] / '9' → 551[gX on 9]
 551[gX on 9] '"' → 937[on "]
 937[on "] 'ℵ' → 820[last step]
 820[last step]  [1 transition(s)]

A little thought shows how the epsilon-transition magic works. Suppose the input string is "xyz909". The code will match the leading " then x and hit state 821. When it sees y and z, the only thing that happens is that the epsilon transition loops back to 821 every time. When it hits the first 9, it’ll advance to 551 but then stall out, because the following character is 0, which doesn’t match the only path forward, the closing ". But the epsilon transition keeps looping, and when the second 9 comes along it’ll proceed smoothly through 551, 937, and 820, signaling a match. Yay!
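
In code, one step of that traversal might look like the following, continuing the hypothetical types from the sketch above. It’s simplified (a real implementation would deduplicate the resulting state list), but it shows the two rules at work: follow ordinary transitions on the input byte, and take every epsilon transition no matter what the byte is.

    // step consumes one input byte. From every current state it follows
    // the ordinary transitions on b, and also takes every epsilon
    // transition — which is how state gS keeps looping on 'y', 'z', and
    // '0' in the "xyz909" example. A real implementation would also
    // deduplicate the resulting list.
    func step(currentStates []*nfaState, b byte) []*nfaState {
        var nextStates []*nfaState
        for _, s := range currentStates {
            nextStates = append(nextStates, s.next[b]...)
            nextStates = append(nextStates, s.epsilon...)
        }
        return nextStates
    }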

So now, I have a fuzz test which adds a pattern for each of about thirteen thousand 5-letter words, with one * embedded in each at a random offset, including the leading and trailing positions. The add-pattern code hardly slows down at all. The matching code slows down a lot, to below 10,000/second, in stark contrast to most Quamina instances, which can achieve millions of matches/second.

I’m sort of OK with this trade-off; after all, it’s matching 10K-plus patterns! I’m going to work on optimizing it, but I have to accept that the math, as in finite-automata theory, might be against me. But almost certainly there are some optimizations to be had. There are possibilities suggested by Cox’s description of Thompson’s methods. And the search for paths forward will likely be good blog fodder. Yay!

Ken again

When I re-read Russ Cox’s piece, I was looking at the pictures and narrative, mostly ignoring the C code. When everything was working, I went back and was irrationally thrilled that my bottom-level function for one state traversal had the same name as Ken Thompson’s: step().

Also, when you process an NFA, you can be in multiple states at once; see the "xyz909" example above. When you’re in multiple states and you process an input symbol, you might end up in zero, one, or many new states. Russ writes, of Ken Thompson’s code, “To avoid allocating on every iteration of the loop, match uses two preallocated lists l1 and l2 as clist and nlist, swapping the two after each step.”

Me too! Only mine are called currentStates and nextStates because it’s 2024.
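
Here’s that trick as a Go sketch, restructured from the step function above so the hot loop reuses two preallocated slices instead of building a fresh list per byte. Hypothetical code; only the two slice names come from real life.

    // matches runs the automaton over input, reusing two preallocated
    // state lists and swapping them after each byte, so the hot loop
    // stops allocating once the slices reach a steady-state capacity.
    func matches(start *nfaState, input []byte) bool {
        currentStates := make([]*nfaState, 0, 16)
        nextStates := make([]*nfaState, 0, 16)
        currentStates = append(currentStates, start)
        for _, b := range input {
            nextStates = nextStates[:0] // keep capacity, drop contents
            for _, s := range currentStates {
                nextStates = append(nextStates, s.next[b]...)
                nextStates = append(nextStates, s.epsilon...)
            }
            currentStates, nextStates = nextStates, currentStates // the swap
        }
        for _, s := range currentStates {
            if s.label == "last step" { // stand-in for a real matched-state check
                return true
            }
        }
        return false
    }

Fed the automaton from buildXDotStar9 and the bytes of "xyz909" (quotes included) with a trailing aleph, it returns true.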

And thereby hangs a blog or maybe more than one. Because traversing the NFA is at Quamina’s white-hot center. You really REALLY don’t want to be allocating memory in that code path. Which should be straightforward. But it’s not, for interesting reasons that raise optimization problems I’m just starting to think about, but you’ll probably hear all about it when I do.

Un-generic

In the process of moving Quamina from DFAs to mixed DFA/NFA to pure-NFA I adopted and then abandoned Go’s generics. They hate me. Or I’m not smart enough. Or something. I wrote about the experience back in 2022 and while that piece ended inconclusively, I am personally much happier with generics-free Go code. Maybe they make other people happy.

Hard to understand

And then finally, there’s this one function I wrote in June 2022; doesn’t matter what it does. It has a comment at the top that begins: “Spookeh. The idea is that…” and goes on for a long paragraph which, well, I can’t understand. Then I look at the code and think “that can’t work.” I keep thinking of sequences that should send it off the rails and write the unit tests and they fail to fail, and I use the prettyprinter and the NFA it generates is ruthlessly correct. I go back and look at it every few days and end up shaking my head. This is making me grumpy.

But after all, I did write, in a previous Quamina Diary episode: “The observation that computer programmers can build executable abstractions that work but they then have trouble understanding is not new and not surprising. Lots of our code is smarter than we are.”

But I’ll figure it out. And it’s nice to have interesting computer-programming stuff to blog about.

Sex Edit War! 15 Jun 2024, 7:00 pm

In January 2010 I drove twenty-five minutes across Vancouver to the University of British Columbia’s main library, with the goal of crushing an opponent in a Wikipedia edit war. The battleground was the entry on T.E. Lawrence (better known as Lawrence of Arabia). I won that war. As a consequence, I consider myself the world’s leading living expert on Lawrence’s sexuality.

[Note: This is posted alongside Wikipedia Pain, which is about the issues of truth and expertise in Wikipedia editing, in an effort to share what the process feels like from the inside.]

Why Lawrence, anyhow? My Dad, an Alberta farm boy, became a Professor of Agriculture, and spent most of his career in the Third World, much of it in Lebanon and Jordan. As a result, I spent my youth there, with plentiful opportunities for touristing all over the Middle East, including many of the spots that appear in Lawrence’s monumental war memoir Seven Pillars of Wisdom.

I ran across Seven Pillars in college and devoured it, from time to time thinking “I’ve been there!” While it’s full of camel charges, train-bombings, and other Ripping Yarns, it’s a difficult book, not a light read at all. But I enjoyed it and was left wondering who this guy was. So in the course of time I read Lawrence’s other works, some biographies (there are many) and especially, the collected letters.

Lawrence was an avid correspondent, sending letters almost like we do emails, multiple times most days. I suspect that a whole lot of the Lawrence biographers got the idea by reading the letters and like me thinking “who is this guy?” You might want to do a little Lawrence reading.

Conducting archeology on my blog reveals that I apparently noticed Wikipedia in 2003 and had started contributing to the Lawrence article by 2004; in that year I also wrote “Maybe the Wikipedia is a short-lived fad, maybe it’ll get better, maybe it’ll get worse, but I was surprised that nobody pointed this out: The Wikipedia is beautiful. It’s an unexpected and unexplainable triumph of collective creativity and of order over entropy. I hope it lasts a long time, and those who criticize it Just Don’t Get It.”

At that time popular opinions of The Encyclopedia That Anyone Can Edit ranged from a headshaking blow-off of the idea’s obvious craziness to active fear and hostility. British technology journalist Andrew Orlowski once referred to Wikipedians as “Khmer Rouge in daipers” (sic). I became a partisan, wading into the ring against figures as eminent as Bob McHenry, former Editor of the Britannica, who compared Wikipedia to a public toilet: “you can’t be sure who was there before you.” I enjoyed rolling out my rhetorical and polemical cannon and firing back. From December 2004: “One thing is sure: the Wikipedia dwarfs its critics.”

It must be said that back then, the critics had a point. Those of us who waded in early often found entries about major subjects of history or culture which were a stinking mess. Lawrence was one such; a farrago of conspiracy theories and thinly-sourced fantasies.

Sex!

In particular the section about Lawrence’s sexuality, a subject much discussed by his biographers and occasionally in the popular press. The amount of time I’ve put into making this fact-based would probably be regarded as ridiculous by most sane people. [Would they be wrong? -Ed.] [Pretty sure. -T.]

I have plenty of by-and-about-Lawrence books on my shelves and had read more or less every published letter, which I thought gave me a fair claim to knowing him better as a person than your average Wikipedia editor. By dint of dogged incremental citation-backed edits, I was making good progress by 2009 at introducing order to the chaos.

Edit!

Editing Wikipedia involves regular, often intense, disputes about what should be said. These take place on the “Talk” page that is associated with each article. For a contentious entry, such as Lawrence’s had become, the Talk page can become huge, much larger than the entry itself.

In these disputes, the criteria that matter are “notability” and “verifiability”. To be included, a subject must be notable, i.e. worth mentioning. When is something notable? If, and only if, there are mentions of the subject in multiple credible mainstream sources. Further, any assertion must be verifiable, i.e. there is evidence to establish that the claims in the material are correct. Both criteria are addressed by providing citations from Reliable Sources.

On the subject of verifiability, Wikipedia says to the world: Any material that is not verifiable will eventually be removed. That tenet gives a warm glow to those of us who live on the Internet and care a lot about truth and untruth.

The subject at hand was homosexuality. First, had Lawrence been gay? Second, what was his attitude toward gay people? Remember, this is a man who died in 1935; in his lifetime, homosexuality was publicly much disapproved-of and in fact specifically forbidden by law.

I thought I had the facts on my side. Whatever Lawrence’s orientation, there was no evidence of consensual intimacy with anyone of any gender, and he repeatedly and explicitly denied, in private correspondence, any experience of sex.

On the other hand, his writing includes multiple warm, approving remarks about male/male sexual relationships. So I thought the case for “celibate and tolerant” was pretty well open and shut.

War!

But then I found I had an adversary.

“Factuarius” – the handle of another active Wikipedia editor – came to fight. For reasons opaque to me, Factuarius was pretty convinced that Lawrence had been gay and/or disapproved of homosexuality. He was able to assemble citations where people had alleged relationships between Lawrence and one or another male person, but this was well-plowed ground; biographers had found an absence of evidence for the relationships and reasonably convincing reasons to doubt their having happened.

Factuarius decided that Lawrence’s having disapproved of homosexuality was the hill he was going to die on. He triumphantly produced two citations that supported his position, declared victory, and told me to stand down.

The first was “Khondakar Golam Mowla, 2008 p. 258”. The book is The Judgment Against Imperialism, Fascism and Racism Against Caliphate and Islam: Volume 1. You can buy it from Amazon for $36.49 as I write this. It turns out it is self-published at “AuthorHouse” and that its Foreword denounces, among other things, “Ataturk, a secret Jew”. The tone generally follows from there. I pointed out to Factuarius that I could go to AuthorHouse and generate a book claiming Lawrence was from Mars.

That left him hotly defending his last reference, a Lawrence letter cited in “Homosexuality and Orientalism: Edward Carpenter's journey to the east, P. K. Bakshi, Prose Studies, Volume 13, Issue 1 May 1990, pages 151-177, Routledge”. Seeing no alternative, I made that drive over to the nearest big University research library.

It took a while to track down Prose Studies, whose dusty and clearly-unvisited volumes occupy quite a few shelf-feet. It was founded in 1977 and the Internet tells me it’s still publishing. I really don’t know what this journal is for or what effect on the world, if any, its existence is designed to achieve. [Arrogant, much? -Ed.] [Trying to be polite. -T.]

Sure enough, the article about Edward Carpenter was there in the May 1990 volume. I read it. I photographed (badly, with a 2010 phone-cam) the title and index pages to prove that I had done so. The article mentioned Lawrence twice, suggesting in an off-handed way that he was an example of English fascination with homosexuality and “the Orient”. But there was nothing there that looked like Factuarius’ citation.

Victory!

I was left happy for multiple reasons. It is a wonderful thing that research libraries exist and preserve academic journals for their own sake, whether or not any human will ever consult their pages. It was pretty cool playing scholarly sleuth in the quiet passages of the library. Best of all, Factuarius retired silently from the fray.

Which was actually a pretty minor scuffle by Wikipedia standards. There is a hilarious page entitled Wikipedia:Lamest edit wars, which I recommend just for fun. It even categorizes them. The first-appearing category is “Ethnic and national feuds”, featuring the titanic struggles over the ancestries of Frédéric Chopin and Freddie Mercury. So far, none of these has metamorphosed into a real actual nation-against-nation shooting war, but I’m not saying it couldn’t happen.

Eventually I took the trouble of collecting every citable fact about Lawrence’s sexuality that I could find in all the known published resources – online search in the Gutenberg Project and various other sources helped. I published them in a blog piece entitled Sex and T.E. Lawrence, which has been useful in subsequent (much less dramatic) editing disagreements.

Finally, I gave a talk at a social-media conference sometime in the 2000s entitled Editing Wikipedia in which I had great fun relating this episode, and I think the audience did too. In particular, reading out spicy passages illustrating Lawrence’s real kink – there’s strong evidence that he was a masochist. For example, in later life, he paid to have himself whipped “severely enough to produce a seminal emission”.

The effect, at the end of all this, was that material that was not verifiable – an assertion about a historically-notable person’s viewpoint on a particular issue – was, as it should be, removed from Wikipedia.

Also, pursuing the truth can be its own reward.

Wikipedia Pain 15 Jun 2024, 7:00 pm

There are voices — some loud and well-respected — who argue that Wikipedia is deeply flawed, a hellscape of psychotic editors and contempt for expertise. I mostly disagree, but those voices deserve, at least, to be heard.

[Note: There’s a companion blog post, Sex Edit War!, about my own experience in a Wikipedia Edit War. (I won! It was fun!) I hope it’ll make some of this narrative more concrete.]

Background

If you look at this post’s Reference Publishing topic, you’ll see a lot of Wikipedia-related material. I was one of its early defenders against the early-days waves of attackers who compared it to a public toilet and its editors to the Khmer Rouge.

I should also disclose that, over the years, I have made some 2,300 Wikipedia edits, created seven articles, and (what makes me happiest) contributed 49 images which have been used, in aggregate, 228 times.

I say all this to acknowledge that I am probably predisposed to defend Wikipedia.

What happened was…

Somebody spoke up on the Fediverse, saying “I wonder if reporters know that Wikipedia hallucinates too??” I’m not giving that a link, since they followed up with a post asserting that ChatGPT is better than Wikipedia. Life’s too short for that.

Anyhow, I replied “The difference is, errors in Wikipedia tend to get systematically fixed. Sometimes it takes more work than it should, but the vast majority of articles are moving in the right direction a vast majority of the time.” Much discussion ensued; follow the threads.

Shortly thereafter, the redoubtable JWZ complained about an edit to his page and I spoke up noting that the edit had been reversed, as bad edits (in my experience) usually are. That conversation branched out vigorously, dozens of contributions. Feel free to trawl through the Fediverse threads, but you don’t have to, I’ll summarize.

Gripe: Bad editors

This kept coming back.

I dunno. I don’t want to gaslight those people; if that’s the experience they had, that’s the experience they had. My own experience is different: The editors I’ve interacted with have generally been friendly and supportive, and often exceptionally skilled at digging up quality citations. But I think that these reports are something Wikipedia should worry about.

Gripe: Disrespect of expertise

By number and volume of complaints, this was the #1 issue that came up in those threads:

I generally disagree with these takes. Wikipedia not only respects but requires expert support for its content. However, it uses a very specific definition of “expert”: Someone who can get their assertions published in one or more Reliable Sources.

I think that if you’re about to have an opinion about Wikipedia and expertise and citations, you should give that Reliable-Sources article a careful read first. Here’s why: It is at the white-hot center of any conversation about what Wikipedia should and should not say. Since Wikipedia is commonly the top result for a Web search, and since a couple of generations of students have been taught to consult but not cite it, the article is central to what literate people consider to be true.

Let’s consider the complaints above. Mr Dear literally Wrote the Book. But, I dunno. I went and looked at the PLATO article and subjects linked to it, and, well, it looks good to me? It cites Mr Dear’s book but just once. Maybe the editors didn’t think Mr Dear’s book was very good? Maybe Dear says controversial things that you wouldn’t want to publish without independent evidence? The picture is inconclusive.

As for Mr O’Neill’s complaint, no sympathy. Given the social structure of capitalism, the employees and leadership of a company are the last people who should be considered Reliable Sources on that company. Particularly on anything that’s remotely controversial.

Mr Zawinski is upset that the person who chooses citations from Reliable Sources “knows nothing”, which I take to be an abbreviation for “is not a subject-matter expert”. There’s some truth here.

When it comes to bald statements of fact, you don’t need to be an expert; if more than one quality magazine or academic journal says that the company was incorporated in 1989, you don’t need to know anything about the company or its products to allow “founded in 1989” into an article.

On the other hand, I think we can all agree that people who make significant changes on articles concerning complex subjects should know the turf. My impression is that, for academic subjects, that condition is generally met.

Mr Rosenberg, once again, is upset that his personal expertise about the PS3 is being disregarded in favor of material sourced from a gamer blog. I’d have to know the details, but the best possible outcome would be Mr Rosenberg establishing his expertise by publishing his narrative in a Reliable Source.

Bad Pattern

There’s a pattern I’ve seen a few times where a person sees something in Wikipedia in an area where they think they’re knowledgeable and think it’s wrong and decide “I’ll just fix that.” Then their edits get bounced because they don’t include citations. Even though they’re an “expert”. Then that person stomps away fuming publicly that Wikipedia is crap. That’s unfortunate, and maybe Wikipedia should change its tag-line from “anyone can edit” to “anyone who’s willing to provide citations can edit.”

Implications

This policy concerning expertise has some consequences:

  1. The decision on who is and isn’t an expert is by and large outsourced to the editorial staff of Reliable Sources.

  2. There are ferocious debates among editors about which sources are Reliable and which are not, in the context of some specific article. Which is perfectly appropriate and necessary. For example, last time I checked, Fox News is considered entirely Reliable on the finer points of NFL football, but not at all on US politics.

  3. There are many things which people know to be true but aren’t in Wikipedia and likely never will be, because no Reliable Source has ever discussed the matter. For example, I created the East Van Cross article, and subsequently learned the story of the cross’s origin. I found it entirely convincing but it was from a guy I met at a friend’s party who was a student at the high school where and when the graphic was first dreamed up. I looked around but found no Reliable Sources saying anything on the subject. I doubt it’ll ever be in Wikipedia.

What do you think of those trade-offs? I think they’re pretty well OK.

The notion that anyone should be allowed to add uncited assertions to Wikipedia because they think they’re an expert strikes me as simultaneously ridiculous and dangerous.

Real problems

Obviously, Wikipedia isn’t perfect. There are two problems in particular that bother me all the time, one small, one big.

Small first: The editor culture is a thicket of acronyms and it’s hard to keep them straight. I have considered, in some future not-too-fierce editorial debate, saying “Wait, WP:Potrezebie says you can’t say that!” Then see if anyone calls me on it.

The big problem: The community of editors is heavily male-dominated, and there have repeatedly been credible accusations of misogyny. I have direct experience: I created the article for Sarah Smarsh, because we read her excellent book Heartland in my book club, then I was shocked to find no entry. Despite the existence of that mainstream-published and well-reviewed book, and the fact that she had published in The Guardian and the Columbia Journalism Review, some other editor decreed that that was insufficient notability.

At the time, I reacted by gradually accumulating more and more citations and updating the draft. Eventually she published another book and the argument was over. These days, in that situation I would raise holy hell and escalate the obstruction up the Wikipedia stack.

To Wikipedia’s credit, its leadership knows about this problem and gives the appearance of trying to improve it. I don’t know the details of what they’re trying and whether they’re moving the needle at all. But it’s clearly still a problem.

Once again…

I stand by what I said in December 2004: Wikipedia dwarfs its critics.

Parable of the Sofa 1 Jun 2024, 7:00 pm

When Lauren was pregnant with a child who’s now turning 25, we purchased a comfy dark-brown leather sofa which fits our living room nicely. What with kids and relatives and employees and cats and Standards Committees and friends and book clubs and socials, the butt-support cushions had, a quarter century later, worn out. So we had them replaced, at a fair price, by a small local business. Which is something that modern capitalism is trying to make impossible.

Worn leather with a phone for scale

I’ll be honest; when we realized how ratty the sofa was getting, my first thought was “crap, gonna have to buy a sofa”. But Lauren said “No, because new sofas are junk. Also, Luxcious.”

I’ll get to Luxcious in a bit, but it turns out that new sofas, by and large, really are. Why would that be? Well, check out Why Are (Most) Sofas So Bad? in Dwell magazine, which has a weirdly-intermittent paywall; here’s another version.

From early in the piece: “Sofas made in the past 15 years or so are absolute garbage, constructed of sawdust compressed and bonded with cheap glue, simple brackets in place of proper joinery, substandard spring design, flimsy foam, and a lot of staples.” It’s excellent, well-written, and will take you some surprising places.

But the subtext is drearily familiar. Globalization: Check. Cheap-labor arbitrage: Check. Tax engineering: Check. High profits: Check. Flat-packing: Check. Late Capitalism: Check check fucking check.

Quality furniture is expensive to make, and should be. But it doesn’t wear out fast, and thus deserves extended maintenance.

Luxcious

Its Web site (“Breathe new life into old furniture”) is way prettier than its location, in an old and extremely miscellaneous high-traffic zone: auto-body shops, hipster lounges, self-storage, beauty supplies…

Luxcious on Main Street

They’re family-run and idiosyncratic. You have to know how to find the sketchy rear parking lot and walk in the back door. But they’re friendly and competent. Here’s the new leather they bought for the cushions.

One cow’s worth of leather

And here’s the sofa with the re-covered cushions in place.

Sofa with refinished cushions

Yes, from this angle, the new cushions make the sofa’s back look shabby, but it’s not as obvious to the naked eye and after a decade or so we’ll never notice it.

The whole job cost us $1100 Canadian. Given that the sofa cost three-thousand-plus 1999 dollars and new leather sofas of the “not flat-packed sawdust and glue” variety quickly get into five figures, the choice was a no-brainer.

“Lifestyle”

This kind of transaction is exactly what modern capitalism is trying to stamp out.

A single-location family-owned business that provides a living for a few people? With no plans to load up on debt or other financial engineering? Or for growth into unicorn status? No GenAI dimension? No marketing or public-relations people?

In conversation with venture capitalists, you hear the phrase “lifestyle business”, meaning one that is doing nicely and rewarding the people who run it and which isn’t planning for unbounded growth. The words “lifestyle business” are always, of course, uttered in a voice dripping with contempt. Luxcious is a lifestyle business.

It seems blindingly obvious that an economy with a higher proportion of lifestyle businesses is going to be more resilient, more humane, and immensely more pleasant than the one that the Leaders Of Industry are trying to build.

How would we get there from here? I’m not smart enough to figure out what the regulatory regime is that would ban most of what private-equity does and tilt the playing field in favor of resilient lifestyle businesses.

But I’d sure vote for a political party that convinced me it was trying to achieve that.

Tedeschi Trucks 26 May 2024, 7:00 pm

Saturday night we went to a concert by the Tedeschi Trucks Band (TTB). It was excellent and this is partly a review, but mostly a challenge to the community of touring musicians: “Why aren’t your production values as good as TTB’s?”

Just the Facts

TTB lives squarely in the middle of the Southern Rock genre, as invented by the Allman Brothers in 1970 or so. Derek Trucks is the nephew of the Allmans’ original drummer Butch Trucks and performed in a later iteration of that band. Susan Tedeschi had a successful career as a touring and recording blueswoman. Then she and Derek got married and merged their acts.

Tedeschi Trucks Band in concert in Vancouver in 2024

It’s a twelve-piece band: Susan and Derek on guitar, three backup vocalists, three horns, keyboards, bass, and two drummers (one white, one black, per the Southern-Rock canon). The music is blues and soul, wandering into rock. Some of the songs are their own, others genre chestnuts (Statesboro Blues, High Time We Went). They played a three-hour show, but with not that many songs, because every tune features extended instrumental sections. All twelve members got a chance to shine, Derek had a break on every song, and Susan on quite a few.

What was great

Only a couple of the songs weren’t memorable; they write well and cover only the best chestnuts. The musicianship was stellar, with electric guitar front and center. Derek is fluid and effortless, with beautiful tone; Susan solos less but actually plays more interesting stuff. Susan’s the lead voice but four other members are singers, they all got a featured spot and were all pretty great. Susan doesn’t have the vocal range or the shriek, but she had the most soul.

What was best, though — out into “fucking awesome” territory — was what classical musicians call “ensemble” and I guess I’d call “band musicianship”. The songs’ arrangements are just razor-sharp, full of shifts and breaks and little moments of drama and grace, intros and outros and bridges. The players were effortlessly locked onto the center of the rhythm, “so tight they were loose” as the saying goes. The amount of practicing this takes must be epic.

Which was brilliantly supported by the sound people. Every instrument and voice was distinct and clear, and the dynamic range was maybe the best I’ve ever heard from an electric-guitar-based band. Every moment was multilayered and you could hear all the layers.

You could tell (well, if you know something about concert sound, you could) that, at the soundboard, they were intervening judiciously, for example cranking the horns (or backup singers) and fading the guitars when that’s what the song needed.

It was an audience that was fun to be part of, enthusiastically cheering all the solos and regularly leaping to their feet when a song hit the big up-curve. Especially impressive given that plenty of the crowd was old enough to have been there for the birth of Southern Rock.

On top of which, the lighting was subtle and dramatic and tasteful, and only once in the whole three-hour show did they hurt my brain by obnoxiously flashing brilliant lights in my eyes.

Thus my challenge:

To every touring band: Be like TTB!

Seriously: my time on earth covers most of the history of live electric-music performance, plus I’m an audiophile, and for most of my life, most of the sound has been shitty. But in the last few years I’ve regularly heard sound that was better than acceptable, and occasionally dazzlingly-good. But TTB is the most impressive combination I’ve heard of big ensemble, plenty of electric guitar, and sparkling sound.

There is sort of an excuse: Rock, historically, has been carefully engineered to sound good on car radios; specifically the kind of car radios owned by impecunious youth. Dynamic range and layering are not features of this landscape.

Anyhow, my exposure to TTB, prior to this, has been mostly YouTube, and I’ve enjoyed them, but I dunno, now that I’ve heard the real thing, I suspect the online version will feel thin.

If TTB can do it, any band can. But plenty still don’t. That’s increasingly just not acceptable. I wonder if things will start to get generally better? Because I’m pretty sure the musicians care.

Other observations

Running a 12-piece operation must be freaking expensive. I would love to hear the details of the economics. Saturday night they filled a 2600-seat hall with an average ticket price around C$120. So that’s over C$300K gross. The hall costs C$21K and then there’s Ticketmaster’s cut, which if the claims of the recent DOJ litigation are to be believed, would be egregious.

I wonder how a TTB song gets built? In particular, who does the arrangements? Whoever it is, I’m a fan.

Lauren and I were masked (N95) and looking across the audience as far as we could see revealed one other masked person. I dunno, 2600 people in an enclosed space. Call me crazy, but… no, call them crazy. I’m serious.

Unusually, there were huge line-ups for the men’s washrooms, almost none for the women’s. The lady in the row behind us cackled and said “boomer prostates.”

The Colors of Racism 17 May 2024, 7:00 pm

Recently, somewhat by accident, I stumbled into reading a couple of monstrously racist texts, and I’m going to need to update the Wikipedia entry for a famous British author. But I learned a few things along the way that I want to share.

Disclosure

I try to be antiracist, but I don’t think I’m particularly good at it. I sometimes have bigoted feelings but try hard to recognize and not act on them. I’m convinced that humans are naturally tribal and antiracist work will continue to be required for the foreseeable future.

The Author

Anthony Trollope (1815-1882) wrote 47 novels. I generally like them and we own a whole shelf-full. They are funny and tender and cynical; his characters love and marry and go into business and get elected to Parliament and are corrupt and engage in furious professional conflict. Those characters are, without exception, “gentle”, by which I mean members of the British ruling class.

Anthony Trollope in 1864.

When I was traveling around the world a lot, regularly crossing major oceans before the era of in-air Internet, Trollope was a regular companion; his books tend to be big and thick and extremely readable. Want to get started? Barchester Towers, about a bitter feud among the clergymen of an English country town, is one of the funniest books ever written; also there’s an excellent BBC adaptation, with Alan Rickman deliciously smarmy as the horrid Mr Slope.

What happened was…

I’m on a publishing-oriented mailing list and someone wrote “I stumbled on the fact that Trollope wrote a book that describes race relations in the British West Indies” and someone wrote back “It’s a travelogue not a novel, it’s called The West Indies and the Spanish Main, and be careful, that race-relations stuff may not be pleasant to read.” On a whim, I summoned up the book from our excellent public-library system and, oh my goodness gracious, that “not pleasant” was understating it.

The book

Trollope earned his living, while he was establishing his literary career, as an official of the British Post Office, rising to a high level in the organization and not leaving it until he was almost 50.

In 1859, he was sent to reorganize the Post Office arrangements in the West Indies and the “Spanish Main”, the latter meaning southern Central America and northern South America. The expedition lasted several months and yielded this book. In his autobiography, Trollope wrote that he thought it “the best book which has come from my pen.” I think history would disagree. It’s on the Internet Archive, but I’m not linking to explicit racism.

So why am I going to write about it?! Because now, 165 years after this book, racism and its consequences remain a central focus of our cultural struggles. Understanding the forces we combat is kind of important. Also, I recently researched and wrote about the Demerara Rebellion (of the enslaved against their oppressors, in 1823) so I have more context on Trollope’s observations than most.

Background

Trollope’s tone is grumpy but good-humored. In the places he visits, he is generally contemptuous of the hotels, the food, the weather, and the local government.

The main narrative starts in Jamaica. By way of background, slavery had been abolished in 1833, just 25 years before. Many of the sugar plantations that occupied most of Jamaica had collapsed. Thus this:

By far the greater portion of the island is covered with wild wood and jungle… Through this, on an occasional favourable spot, and very frequently on the roadsides, one see the gardens or provision-grounds of the negroes…

These provision-grounds are very picturesque. They are not filled, as a peasant’s garden in England or in Ireland is filled, with potatoes and cabbages, or other vegetables similarly uninteresting in their growth; but contain cocoa-trees, breadfruit-trees, oranges, mangoes, limes, plantains, jack frout, sour-sop, avocado pears, and a score of others, all of which are luxuriant trees, some of considerable size, and all of them of great beauty… In addition to this, they always have the yam, which is with the negro somewhat as the potato is with the Irishman; only that the Irishman has nothing else, whereas the negro generally has either fish or meat, and has also a score of other fruits beside the yam.

We wouldn’t use that word any more to describe Black people, but it was thought courteous in Trollope’s day. He does deploy the N-word, albeit rarely, and clarifying that it was normally seen, even back then, as an insult.

The bad stuff

It comes on fast. In the Jamaica chapter, the first few subheadings are: “Introduction”, “Town”, “Country”, “Black Men”, “Coloured Men”, and “White Men”. That “Black Men” chapter begins with six or so pages of pure racist dogma about the supposed shortcomings of Black people. I will not enumerate them, and obviously none stand up to the cold light of scientific inquiry.

But then it gets a little weird. Trollope notes that “The first desire of a man in a state of a civilization is for property… Without a desire for property, man could make no progress.” And he is harsh in his criticism of the Black population for declining to work long shifts on the sugar plantations in hopes of building up some capital and getting ahead.

And yet Trollope is forced to acknowledge that his position is weak. He describes an episode of a Black laborer knocking off work early and being abused by an overseer, saying he’ll starve. The laborer replies “No massa; no starve now; God send plenty yam.” Trollope muses “And who can blame the black man? He is free to work or free to let it alone.” It is amusingly obvious that this is causing him extreme cognitive dissonance.

And he seems shockingly oblivious to issues of labor economics. On another occasion it is a group of young women who are declining the hot nasty work in the cane fields:

On the morning of my visit they were lying with their hoes beside them… The planter was with me, and they instantly attacked him. “No, massa; we no workey; money no nuff,” said one. “Four bits no pay! no pay at all!” said another. “Five bits, massa, and we gin morrow ’arly.” It is hardly necessary to say that the gentleman refused to bargain with them… “But will they not look elsewhere for other work?” I asked. “Of course they will,” he said; “… but others cannot pay better than I do.”

(A “bit” was one eighth of a dollar; I can remember my grandfather referring to a quarter, i.e. a 25¢ coin, as “two bits”.)

They’re demanding a 20% raise and, as is very common today, the employer deems that impossible.

Trollope contrasts the situation in Barbados, where there is no spare land and thus no “provision grounds” and the working class (in this case, all-Black) is forced to labor diligently for their daily bread; and is confident that this is better.

He also visits Cuba, where slavery is still legal, and visits a plantation with an enslaved workforce: “During the crop time … from November till May, the negroes sleep during six hours out of the twenty-four, have two for their meals, and work for sixteen! No difference is made on Sunday.” Trollope’s biggest concern was that the enslaved received no religious instruction nor opportunities to worship.

Trollope regularly also has to wrestle with the tension that arises when he meets an accomplished or wise or influential Black person. For example, upon arriving in New Amsterdam (in Demerara):

At ten o’clock I found myself at the hotel, and pronounce it to be, without hesitation, the best inn, not only in that colony, but in any of these Western colonies belonging to Great Britain. It is kept by a negro, one Mr. Paris Brittain, of whom I was informed that he was once a slave… he is merely the exception which proves the rule.

Here are two more samples of Trollope twisting himself in knots over what seems to him an economic mystery.

But if the unfortunate labourers could be made to work, say four days a week, and on an average eight hours a day, would not that in itself be an advantage? In our happy England, men are not slaves; but the competition of the labour market forces upon them long days of continual labour. In our own country, ten hours of toil, repeated six days a week, for the majority of us will barely produce the necessaries of life. It is quite right that we should love the negroes; but I cannot understand that we ought to love them better than ourselves.

The complaint generally resolves itself to this, that free labour in Jamaica cannot be commanded; that it cannot be had always, and up to a certain given quantity at a certain moment; that labour is scarce, and therefore high priced, and that labour being high priced, a negro can live on half a day's wages, and will not therefore work the whole day — will not always work any part of the day at all, seeing that his yams, his breadfruit, and his plantains are ready to his hands.

In what sense is England “happy”? Granted, it’s obvious from the point of view of the “gentle” ruling class, none of whom are doing manual labour sixty hours per week.

That aside, the question he raises still stands, two centuries later: Why should anyone work harder than they need to, when the benefits of that work go to someone else?

“Coloured”

There’s lots more of this, but it’s worth saying that while Trollope was racist against Blacks, he was, oddly, not a white supremacist. He considers the all-white colonial ruling class to be pretty useless, no better than the Blacks he sneers at, and proclaims that the future belongs to the “coloured” (i.e. mixed-race) people. He backs this up with some weird “Race Science” that I won’t go into.

Unforgivable

Trollope’s one episode of pure venom is directed at the already-dying-out Indigenous people of the region, pointing out with approval that one of the island territories had simply deported that whole population, and suggesting that “we get rid of them altogether.” This seems not to be based on race but on the observation that they “more than once endeavoured to turn out their British masters”. Colonialism is right behind racism in the line-up of European bad behaviors. It may also be relevant that he apparently did not meet a single Indigenous West-Indian person.

Meta-Trollope

I finished reading The West Indies and the Spanish Main because Trollope’s portrayals of what he saw were so vivid and I couldn’t help being interested.

I had read Trollope’s autobiography and some more bits and pieces about him, and had encountered not a word to the effect that whatever his virtues and accomplishments, he was shockingly racist. So I checked a couple of biographies out of the local library and yep, hardly a mention. One author noted that The West Indies and the Spanish Main was out of tune with today’s opinions, but there was no serious discussion of the issue. Wikipedia had nothing, and still doesn’t as I write this, but I plan to fix that.

I dug a little harder here and there around the Internet and turned up nothing about anti-Black racism, but a cluster of pieces addressing antisemitism; see Troubled by Trollope? and Why Anthony Trollope Is the Most Jewish of the Great English Novelists. There are a few Jews in Trollope’s novels, ranging from wholly-admirable heroes (and heroines) to revolting villains. So you might think he comes off reasonably well, were it not for casual splashes of antisemitic tropes; the usual crap I’m not going to repeat here.

In case it’s not obvious, Trollope’s writings and opinions were strikingly self-inconsistent, often within the course of a few pages. Well, and so is racism itself.

At that point in history there was an entire absence of intersectionalist discourse about racism being, you know, intrinsically bad, and there were many who engaged in it enthusiastically and sincerely while remaining in polite society.

Trollope’s racism is undeniable, but then he (once again, inconsistently) sounds non-racist in theory. (However, he was gloomy about the attitudes of the white population.) Check this out:

It seems to us natural that white men should hold ascendency over those who are black or coloured. Although we have emancipated our slaves, and done so much to abolish slavery elsewhere, nevertheless we regard the negro as born to be a servant. We do not realize it to ourselves that it is his right to share with us the high places of the world, and that it should be an affair of individual merit whether we wait on his beck or he on ours. We have never yet brought ourselves so to think, and probably never shall.

That text feels remarkably modern to me. I am a little more optimistic than he is in his closing four words; some white people work hard at antiracism. But for a lot of white people, his take remains depressingly accurate.

Degrees of racism?

I suspect that, if Trollope were with us today, his writings would probably be conventionally antiracist. His opinions were solidly in his era’s mainstream and I suspect he would find himself in ours, because he was really a pretty conventional and actually kind of boring person.

With the single exception of those two sentences about the Indigenous people, he seems to exhibit no particular emotional bias against any ethnic group.

Why, you might wonder, do I mention this? Therein lies a tale. In his autobiography, when he discusses The West Indies and the Spanish Main, he notes that it received a favorable review in The Times of London. I thought I’d like, for the sake of context, to read that. (Thanks to William Denton for retrieving the page images.)

I certainly didn’t enjoy reading The West Indies (unsigned) from early 1860 in The Times. It fills most of a broadsheet page, dozens of column-inches one after the other oozing vitriolic hate of Black people. I’m not going to even try to describe it any further; I felt literally nauseated in reading and didn’t make it through to the end.

I suspect that if that Times writer were with us today, he’d be an unreconstructed alt-right dog-whistler, a good ole boy in a MAGA hat.

Reading this crap made me feel a little less angry about Trollope, who generally liked people. Here’s what I think I learned: Racism comes in multiple flavors. There are some people (like Trollope) who are intersectionally bigoted in a sort of unthinking and incurious way, but not that emotionally bound to it. These are the people that need to hear the antiracist message, loudly and clearly, over and over. Because they might listen and learn.

Then there are the others. In 1860, that Times reviewer. Today, the slave-state GOP MAGAs, the Israeli settler movement, Modi’s Hindutva hoodlums. They genuinely hate The Other, down in their bellies. It’s how they define themselves. Talking to them is useless. They have to be defeated and removed from positions of power and influence. Then, thankfully, they can be ignored. Because listening to them is useless too.

Storage Churn 5 May 2024, 7:00 pm

What are the highest-impact Cloud services? Storage would be near the top of any list. Where by “Storage” I mean what S3 does: Blobs-of-bytes storage that is effectively unlimited in capacity, credibly more durable than anything you could build yourself, and easily connected to the world, either directly or through a CDN. I think we’re entering a period of churn where there’s going to be serious competition on storage price and performance. Which, by the way, is crucially relevant to the Fediverse.

Let’s start with AWS, since they invented the modern Storage concept. The most important thing about S3 is this: There appear to be zero credible reports of S3 data loss. Given the number of objects it holds, and the number of years it’s held them, that’s remarkable.

It’s a safe place to store your data. Yeah, the API is a little klunky, and the latency can be high, and the hardwired bucket/object hierarchy is annoying, and so are the namespace issues. And it’s not cheap.

But it’s safe. And fast enough to be useful. And safe. And dead easy to connect up to a CDN. And did I mention that it’s safe?

S3…

AWS, to their credit, aren’t resting on their laurels. Here is a good Register interview with Andy Warfield, one of the lead S3 engineers and also a really good person. He’s talking about another variation on the basic S3 service, called “Express”, which has more filesystem-y semantics, higher performance, but (reading between the lines) a little less durability? (Also, more expensive.)

What’s notable about S3 isn’t this particular feature, but the fact that AWS keeps rolling out new ones. So it’s a moving target for the emerging competition.

…but cheaper…

In recent years and especially over the last few months, alternatives and competitors to S3 keep crossing my radar. A bunch of them have a premise that’s essentially “S3-compatible, but cheaper”: Backblaze B2, Digital Ocean Spaces, Wasabi, IDrive e2, Cloudflare R2, and Telnyx Cloud Storage. I’m sure I’ve missed some.

…and faster!

Some of the products claim to be way faster. Which matters if it’s true, but so far I don’t know of any popular benchmarking standards, so I’d take the numbers with a grain of salt. If I really cared, for a big project, I’d want to try it with my own code.
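To make “try it with my own code” concrete: here’s a minimal Go sketch (bucket and object names are made up, and real benchmarking would want percentiles, concurrency, and a spread of object sizes) that times a run of GETs through the v2 AWS SDK and reports the mean:

package main

import (
    "context"
    "fmt"
    "io"
    "time"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
    ctx := context.Background()
    cfg, err := config.LoadDefaultConfig(ctx)
    if err != nil {
        panic(err)
    }
    client := s3.NewFromConfig(cfg)
    const n = 100
    var total time.Duration
    for i := 0; i < n; i++ {
        start := time.Now()
        out, err := client.GetObject(ctx, &s3.GetObjectInput{
            Bucket: aws.String("my-benchmark-bucket"), // made-up name
            Key:    aws.String("sample-object"),       // ditto
        })
        if err != nil {
            panic(err)
        }
        io.Copy(io.Discard, out.Body) // include the full body read in the timing
        out.Body.Close()
        total += time.Since(start)
    }
    fmt.Printf("mean GET latency over %d requests: %v\n", n, total/n)
}

Point it at each contender in turn and you’ve at least got numbers you can compare.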

Here are a few of those:

S2

See Designing serverless stream storage. This is still more a research project than a product, but I drop it in here because it says that access to S3 Express made it possible. Its claim to fame appears to be higher performance.

Tigris

Tigris offers what they describe as “Globally Distributed S3-Compatible Object Storage”. I think the best description of what that means is by Xe Iaso of Fly.io, in Globally Distributed Object Storage with Tigris. It’s not just well-written, it’s funny. Apparently Fly.io bundles Tigris in, with command-line and billing integration.

Bunny

“The fastest object storage, replicated to the edge” is their big claim.

CDN?

Bunny sounds like it’s partly a CDN. And it’s not the only one. Which makes obvious sense; if you want to deliver the stuff you’re storing to users around the world at scale, you’re going to be hooking your storage and CDN together anyhow. So those lines are going to stay blurry.

Compatibility and intellectual property

S3 compatibility is an issue. It’s interesting that AWS has apparently decided not to defend the S3 API as intellectual property, and so these things cheerfully claim 100% plug-compatibility. And when they don’t have it, they apologize (that apology looks unusually far under the covers; I enjoyed reading it).
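In practice, “S3-compatible” means your existing SDK code pointed at a different endpoint. A sketch, assuming a recent v2 AWS SDK for Go; the endpoint URL is illustrative, not any real provider’s (each one documents its own):

package main

import (
    "context"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

// newCompatibleClient builds a stock S3 client aimed at a non-AWS store.
func newCompatibleClient(ctx context.Context) (*s3.Client, error) {
    cfg, err := config.LoadDefaultConfig(ctx)
    if err != nil {
        return nil, err
    }
    return s3.NewFromConfig(cfg, func(o *s3.Options) {
        o.BaseEndpoint = aws.String("https://storage.example.com") // illustrative URL
        o.UsePathStyle = true // some compatible stores want path-style addressing
    }), nil
}

func main() {
    client, err := newCompatibleClient(context.Background())
    if err != nil {
        panic(err)
    }
    _ = client // from here, PutObject/GetObject calls work as usual
}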

Durability?

They may claim compatibility, but mostly do not claim equivalent durability. I’ll be honest; if I were picking one, that would worry me. I’d need to see pretty full disclosure of how the services work under the covers.

Unknowns

I just mentioned durability, which is a technology issue. The other big unknowns are about business not technology. First of all, can you sustainably make money selling storage at a price that undercuts AWS’s? I haven’t the vaguest idea.

Second, is this a threat to AWS? There is a vast amount of data that is never gonna migrate off S3 because who’s got the time for that, but if the competition really can save you a lot of money that could hit S3’s growth hard, and Amazon wouldn’t like that. Who knows what might happen?

Now let’s change the subject.

Fediverse storage

I’ll use myself as a Fediverse example. As I write this, my @timbray@cosocial.ca Mastodon account has just over 18K followers, distributed across 3K-and-change instances. So whenever I post a picture or video, each of those instances fetches it and then keeps its own copy, if only in a short-lived cache.

All these files are immutable and identical. Smell an opportunity? Yeah, me too. Someone needs to build an object-store/CDN combo (I’ve already heard people say “FDN”). The API should cater to Mastodon’s quirks. You could split the cost equally or deal it out in proportion to traffic, but either way, I think there’d be big cost savings for nearly every instance.
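One plausible way an FDN could kill the duplication, sketched in Go; to be clear, this is my speculation, not anybody’s shipping design, and the file name is made up. Derive the object key from a hash of the content, so every instance storing the same attachment converges on the same object:

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "os"
)

// contentKey maps identical bytes to an identical object key, no
// coordination between instances required.
func contentKey(path string) (string, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return "", err
    }
    sum := sha256.Sum256(data)
    return "media/" + hex.EncodeToString(sum[:]), nil
}

func main() {
    key, err := contentKey("cat-photo.jpg") // made-up file name
    if err != nil {
        panic(err)
    }
    // Any instance uploading this file computes the same key, so the
    // store holds one copy and the CDN serves it to everyone.
    fmt.Println(key)
}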

Furthermore, it doesn’t feel technically challenging. If I were still at AWS, I’d be working on a PR/FAQ right now. Well, except that, since everything is S3-compatible and CDNs are commoditized, it would be plausible (and attractive) to build your FDN in a way that doesn’t tie you to any particular infrastructure provider.

Someone has already started on this; see Jortage Communal Cloud; small as yet, but pointing in the right direction.

Fun times!

The storage world is a market with no monopolist, where providers are competing on price, performance, and durability. Be still my beating heart.

Photointegrity 29 Apr 2024, 7:00 pm

In March of 2004, just over twenty years ago, I published an ongoing piece entitled, like this one, “Photointegrity”. The issue remains the same, but the rise of AI increases its importance and its difficulty. Here are words on the subject, illustrated by photos all of which have been processed with AI technology.

Pink-orange tulip blossom, folded closed

Tulip blossom, captured with twenty-year old analog technology, enhanced with AI.

There’s an amusing story about the technology behind these flower pictures, down at the bottom of this piece.

Back in 2004

I was already using Photoshop but in fully-autodidactic mode, so I thought I should read a book about it, and selected one by Scott Kelby, “The Photoshop guy” back then and still active in the space, two decades later. It was a good book, but I was left wide-eyed and shocked: I’ll quote from that piece for those of you who don’t want to step back twenty years in time and read it:

Personal Improvement

In particular, Kelby walks through an astounding list of techniques for improving portraits, and I quote: removing blemishes, removing dark circles under the eyes, lessening freckles or facial acne, removing or lessening wrinkles, colorizing hair, whitening the eyes, making eyes that sparkle, enhancing eyebrows and eyelashes, glamour skin softening, transforming a frown into a smile, doing a digital nose job, slimming and trimming, removing love handles, and finally slimming buttocks, arms and thighs.

Integrity?

Screw it, integrity is history. The image is no longer the capture of an instant’s light and colour, it’s, well… whatever you and Photoshop make of it.

Photointegrity

I proposed a definition at the time: “what I’m going to do is strive to balance Truth and Beauty. In practical terms, this means the goal is to make the picture look as much as possible like what I saw, as opposed to as good as possible.”

Simple yellow flower, two buds peeking round its edges

Simple yellow flower, captured with twenty-year old analog technology, enhanced with AI.

I can’t claim that I follow that strictly; most of the pictures in this space come out of the camera looking less pleasing than what I remember seeing, but I will confess that the version you see is often prettier than that memory. Usually, that results from the application of a bunch of Adobe technologies.

Is that OK? It’s a judgment call. Is there anything that isn’t a judgment call? Funny you should ask, because Adobe just announced the Firefly Generative AI 3 model, around which the next version of Photoshop is being built. Hitting those links and just scrolling through the pictures will give you a feeling for what this software will do.

Let me put a stake in the ground. I believe these things:

  1. If you use generative tools to produce or modify your images, you have abandoned photointegrity.

  2. That’s not always wrong. Sometimes you need an image of a space battle or a Triceratops family or whatever.

  3. What is always wrong is using this stuff without disclosing it.

The C2PA angle

Last October, I wrote up C2PA, a useful digital watermarking technology that can be used to label images and video. That piece’s predictions look like they’re coming true; several manufacturers have announced C2PA support. I’m not going to take the space here to describe C2PA again.

I do note that Photoshop already supports C2PA and when it writes a watermark saying “Edited with Photoshop”, that label includes a very few words about what it did: cropping, levels adjustment, and so on; no details.

I believe strongly that when people use Adobe’s Firefly generative AI to create or augment pictures, Photoshop should by default turn C2PA labeling on, and disclose in the watermark whether it is fully-generated or just augmented. Sure, the person generating the image can always take that watermark out, but they can’t change its contents, and assuming C2PA becomes ubiquitous, the absence of a watermark would be reasonable grounds for suspicion.

Cluster of pink fruit-tree blossoms, just opening

Fruit tree blossoms, not open yet, captured with twenty-year old analog technology, enhanced with AI.

AI + photointegrity?

Over the last couple of years, the way I use Adobe Lightroom has changed a whole lot, and it’s mostly because of AI. Specifically, smart select. Lightroom now offers Select functions for Subject, Background, Sky, and Object. There’s also a very useful “Duplicate and invert” for any selection. I use these for almost every photo I take, especially Select Sky. The amount of light in the sky differs from that down here on the surface, and I’m pretty sure that our eyes compensate for that. Almost every picture looks more “real” when you select the sky and dial the brightness down (rarely: up) a touch, and maybe bump the contrast a bit.

This photo would have been a complete failure without those tools.

Allyson’s parents at her memorial

Allyson’s parents speak to the crowd at her memorial.

We were recently at a memorial social for our late friend Allyson. It was on a rooftop, on a bright grey day; the volume of light coming out of the sky was insane, and kept turning my photographic subjects into dark silhouettes.

The photo of Ally’s parents addressing the crowd is not great (her mom’s eyes are closed) but it at least captures a moment. The original was totally unusable, because the subjects are under a canopy and thus shaded, while the sky and cityscape and even mountains were reflecting harshly. So you select the subject, you invert and duplicate, you add light to the subject and subtract from the rest, and you get something that looks exactly like what I saw.

Of course, this depends on a good camera with a lot of dynamic range that can fish detail out of shadows.

I think this process retains photointegrity.

AI-enhanced analog

What happened was, the sun came out after the rain, everything is blooming this time of year, and I wanted to take pictures. I was rummaging for lenses and there was this dark shape at the back of the shelf. “What’s that?” I thought. It turned out to be an old Pentax with “Macro” in its name. Just the thing! Here’s what the combo looks like.

Pentax 100mm macro lens strapped on Fujifilm X-T30

By the way, one reason the Internet is still good is that random enthusiasts maintain obscure databases, for example of camera lenses, from whence this smc Pentax-D FA 100mm F/2.8 Macro, an alternate version of which rejoices in the name “Schneider-Kreuznach D-Xenon”. It seems to have been manufactured only around 2004. I wrote about buying it in 2011 and shooting flowers and dancers with it in 2014; lotsa groovy pix in both.

Anyhow, this lens does a fabulous job of isolating foreground and background. Given this to chew on, Lightroom’s AI gizmo does a fabulous job of selecting just the flower (or background). So it’s easy to sharpen the flower and fade the bokeh; the old lens and the bleeding-edge software were made for each other.

But I digress.

Photointegrity matters

It mattered in 2004 and it matters more every passing year as our level of trust in online discourse falls and the power of generative AI grows. We have the tools to help address this, but we need to think seriously, and use them when appropriate.

Mobile Typing Pain 24 Apr 2024, 7:00 pm

I ran a Fediverse poll asking how people go about entering text on mobile devices. The results shocked me: Half the population just taps away. Do you? Read on for details and speculation.

This ongoing fragment embeds links to previous fragments, because I’ve been worrying and writing about this problem for a long time. Which in itself is interesting, more evidence that the problem is hard.

Mastodon poll on mobile text entry options

The poll post and (long) chain of responses are here on Mastodon.

People care

First, 3.5K poll responses is more than I usually see on the Fediverse; evidence that plenty of people have feelings about this. To reinforce that impression, scroll down through the responses (there are dozens). Many say, essentially, “Entering text on a mobile device is too hard, so I don’t.”

I’m one of those; I regularly start entering a message into a phone, stop, get up, and walk across the room to a real keyboard.

Tap tap wow

I widened my eyes when I saw that half the respondents testify to tapping along letter by letter. I could never. But then I have big fat farmer’s fingers with soft ends, and am not terribly dextrous.

But, probably, I shouldn’t have been surprised; 21 years ago in this blog I remarked that “it's pretty impressive to watch a Japanese person pounding text into their PDA at high speed using just their thumbs.” And today I watch teenage digits dance on devices like maddened maenads; they seem not to find it tedious.

Swiping etc

A quarter of poll respondents reported swiping words into their phones.

I mentioned above that people have been working on this for a long time. Check out this progress report from ongoing in 2011. It’s worth noting that Android’s input method being replaceable was important in driving this innovation.

My own proposal, the Digitator, has failed to break through.

That piece concludes “Anyhow, I’m pretty sure that something will come along.” But on the evidence it hasn’t, really.

The AI angle: Auto-predict and voice

The least popular poll options were tap-plus-autopredict and voice. I guess I’m weird, because those are what I mostly use. I suspect many others should too but don’t, probably because they tried those things a while ago and haven’t revisited them recently.

In my experience (which, granted, is almost all on Google Pixel devices) the autopredict and voice options have gotten stronger with almost every release. Not just a little bit stronger, either. Perhaps it’s just because I’m the white male Anglophone “canonical human” that designers build for, but I get dramatically better results than I used to.

Now obviously, most reasonable people will only talk to their phone when they’re in a private place, which limits the use of that option. But if you can find privacy, the voice option is getting remarkably good.

Which is to say, I can enter message or email text at a pace that is sometimes adequate. Do I enjoy doing this? No, I hate it, as I noted above, and will make a real effort to switch to a keyboard.

In particular if what I want to enter is important, if it might matter.

Because anything that matters deserves editing, and it’s rare indeed that I hit “Send” on a first draft. And while brute-force text entry is edging into adequacy, editing remains a pool of pain.

Subtext

Two and a half decades into this millennium, the most popular communication products are optimized for consumption and barely adequate for creation. If I were paranoid and cynical, I might suspect that this is no accident. Oh wait, I am. But in fact I think it’s just a hard problem.

Meta.ai Oh My! 18 Apr 2024, 7:00 pm

“Meet Your New Assistant” says the announcement, going on with “Built With Llama 3”. And oh my goodness has it ever got a lot of coverage. So I thought I might as well try it.

My first cut was a little unfair; I asked it about a subject on which I am unchallenged as the world’s leading expert: Tim Bray. (That’s probably overstating it: My wife is clearly in the running.)

So I asked meta.ai “What does Tim Bray think of Google?” Twice; once on my phone while first exploring the idea, and again later on my computer. Before I go on, I should remark that both user interfaces are first-rate: Friction-free and ahead of the play-with-AI crowd. Anyhow, here are both answers; it may be relevant that I was logged into my long-lived Facebook account:

meta.ai on Tim Bray and Google, take 1 meta.ai on Tim Bray and Google, take 2

The problem isn’t that these answers are really, really wrong (which they are). The problem is that they are terrifyingly plausible, and presented in a tone of serene confidence. For clarity:

  1. I am not a Computer Scientist. Words mean things.

  2. I worked for Google between March of 2010 and March of 2014.

  3. I was never a VP there nor did I ever have “Engineer” in my title.

  4. I did not write a blog post entitled “Goodbye, Google”. My exit post, Leaving Google, did not discuss advertising nor Google’s activities in China, nor in fact was it critical of anything about Google except for its choice of headquarters location. In fact, my disillusionment with Google (to be honest, with Big Tech generally) was slow to set in and really didn’t reach critical mass until these troubling Twenties.

  5. The phrase “advertising-based business model”, presented in quotes, does not appear in this blog. Quotation marks have meaning.

  6. My views are not, nor have they been, “complex and multifaceted”. I am embarrassingly mainstream. I shared the mainstream enchantment with the glamor of Big Tech until, sometime around 2020, I started sharing the mainstream disgruntlement.

  7. I can neither recall nor find instances of me criticizing Google’s decision-making process, nor praising its Open-Source activities.

What troubles me is that all of the actions and opinions attributed to meta.ai’s version of Tim Bray are things that I might well have done or said. But I didn’t.

This is not a criticism of Meta; their claims about the size and sophistication of their Llama3 model seem believable and, as I said, the interface is nifty.

Is it fair for me to criticize this particular product offering based on a single example? Well, first impressions are important. But for what it’s worth, I peppered it with a bunch of other general questions and the pattern repeats: Plausible narratives containing egregious factual errors.

I guess there’s no new news here; we already knew that LLMs are good at generating plausible-sounding narratives which are wrong. It comes back to what I discussed under the heading of “Meaning”. Still waiting for progress.

The nice thing about science is that it routinely features “error bars” on its graphs, showing both the finding and the degree of confidence in its accuracy.

AI/ML products in general don’t have them.

I don’t see how it’s sane or safe to rely on a technology that doesn’t have error bars.

Topfew Release 1.0 12 Apr 2024, 7:00 pm

Back in 2021-22, I wrote a series of blog posts about a program called “topfew” (tf from your shell command-line). It finds the field values (or combinations of values) which appear most often in a stream of records. I built it to explore large-scale data crunching in Go, and to investigate how performance compared to Rust. There was plentiful input, both ideas and code, from Dirkjan Ochtman and Simon Fell. Anyhow, I thought I was finished with it but then I noticed I was using the tf command more days than not, and I have pretty mainstream command-line needs. Plus I got a couple of random pings about whether it was still live. So I turned my attention back to it on April 12th and on May 2nd pushed v1.0.0.

GitHub sidebar for Topfew

I added one feature: You can provide a regex field separator to override the default space-separation that defines the fields in your records. Which will cost you a little performance, but you’re unlikely to notice.

Its test coverage is much improved and, as you’d expect, there are fewer bugs. Also, better docs.

Plan

I think it’s pretty much done; honestly, I can’t think of any useful new features. At some point, I’ll look into Homebrew recipes and suchlike, if I get the feeling they might be used.

Obviously, please send issues or PRs if you see the opportunity.

Who needs this?

It’s mostly for log files I think. Whenever I’m poking around in one of those I find myself asking questions like “which API call was hit most often?” or “Which endpoint?” or “Which user agent?” or “Which subnet?”

The conventional hammer to drive this nail has always been something along the lines of:

awk '{print $7}' | sort | uniq -c | sort -nr | head 

Which has the advantage of Just Working on any Unix-descended computer. But can be slow when the input is big, and worse than linear too. Anyhow, tf is like that, only faster. In some cases, orders of magnitude faster. Plus, it has useful options that take care of the grep and sed idioms that often appear upstream in the pipe.
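For the curious, the core idea is tiny. Here’s a minimal Go sketch of what that pipeline computes (count field 7, print the top ten); it is emphatically not Topfew’s actual code, just the shape of it:

package main

import (
    "bufio"
    "fmt"
    "os"
    "sort"
    "strings"
)

func main() {
    // Count occurrences of field 7 (1-based) on standard input.
    counts := make(map[string]uint64)
    scanner := bufio.NewScanner(os.Stdin)
    for scanner.Scan() {
        fields := strings.Fields(scanner.Text())
        if len(fields) >= 7 {
            counts[fields[6]]++
        }
    }
    // Rank by descending count and print the top ten.
    type keyCount struct {
        key   string
        count uint64
    }
    ranked := make([]keyCount, 0, len(counts))
    for k, c := range counts {
        ranked = append(ranked, keyCount{k, c})
    }
    sort.Slice(ranked, func(i, j int) bool { return ranked[i].count > ranked[j].count })
    for i := 0; i < len(ranked) && i < 10; i++ {
        fmt.Printf("%d %s\n", ranked[i].count, ranked[i].key)
    }
}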

Topfew’s got a decent README so I’m not going to invest any more words here in explaining it.

But it’s worth pointing out that it’s a single self-contained binary compiled from standalone Go source code with zero dependencies.

Performance

This subject is a bit vexed. After I wrote the first version, Dirkjan implemented it in Rust and it was way faster, which annoyed me because it ought to be I/O-bound. So I stole his best ideas and then Simon chipped in other good ones and we optimized more, and eventually it was at least as fast as the Rust version. Which is to say, plenty fast, and probably faster than what you’re using now.

But you only get the big payoff from all this work when you’re processing a file, as opposed to a stream; then tf feels shockingly fast, because it divides the file up into segments and scans them in parallel. Works remarkably well.
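The segmentation trick, in outline; a sketch rather than Topfew’s real implementation, which also has to be careful that no segment boundary splits a record:

package main

import (
    "fmt"
    "os"
    "runtime"
)

// segments carves a file into one byte-range per CPU, ready to be
// handed to parallel scanner goroutines.
func segments(path string) ([][2]int64, error) {
    info, err := os.Stat(path)
    if err != nil {
        return nil, err
    }
    size := info.Size()
    n := int64(runtime.NumCPU())
    segs := make([][2]int64, 0, n)
    for i := int64(0); i < n; i++ {
        segs = append(segs, [2]int64{i * size / n, (i + 1) * size / n})
    }
    return segs, nil
}

func main() {
    segs, err := segments("access_log") // any big log file
    if err != nil {
        panic(err)
    }
    fmt.Println(segs)
}

Each parallel scanner seeks to its start offset and (except for the first) skips past the next newline before it begins counting.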

Unfortunately that doesn’t happen too often. Normally, you’re grepping for something or teeing off another stream or whatever. In which case, performance is totally limited by reading the stream; I’ve profiled the hell out of this and the actual tf code doesn’t show up in any of the graphs, just the I/O-related buffer wrangling and garbage collection. Maybe I’m missing something. But I’m pretty sure tf will keep up with any stream you can throw at it.

Tooling

Over the years I’ve become an adequate user of GitHub CI. It’s good to watch that ecosystem become richer and slicker; the things you need seem to be there and for an OSS hobbyist like me, are generally free. Still, it bothers me that Everything Is On GitHub. I need to become aware of the alternatives.

I still live in JetBrains-land, in this case specifically Goland, albeit unfashionably in Light mode. It scratches my itches.

Anyhow, everything is easier if you have no dependencies. And our whole profession needs to be more thoughtful about its dependencies.

Dirty secret

I’ve always wanted to ship a two-letter shell command that someone might use. Now I have. And I do think tf will earn a home in a few folks’ toolboxes.

OSQI 1 Apr 2024, 7:00 pm

I propose the formation of one or more “Open Source Quality Institutes”. An OSQI is a public-sector organization that employs software engineers. Its mission would be to improve the quality, and especially safety, of popular Open-Source software.

Why?

The XZ-Utils backdoor (let’s just say #XZ) launched the train of thought that led me to this idea. If you read the story, it becomes obvious that the key vulnerability wasn’t technical, it was the fact that a whole lot of Open-Source software is on the undermaintained-to-neglected axis, because there’s no business case for paying people to take care of it. Which is a problem, because there is a strong business case for paying people to attack it.

There are other essential human activities that lack a business case, for example tertiary education, potable water quality, and financial regulation. For these, we create non-capitalist constructs such as Universities and Institutes and Agencies, because society needs these things done even if nobody can make money doing them.

I think we need to be paying more attention to the quality generally, and safety especially, of the Open-Source software that has become the underlying platform for, more or less, our civilization. Thus OSQI.

They’re out to get us

For me, the two big lessons from #XZ were first, the lack of resources supporting crucial Open-Source infrastructure, but then and especially, the demonstration that the attackers are numerous, skilled and patient. We already knew about numerous and skilled but this episode, where the attacker was already well-embedded in the project by May 2022, opened a few eyes, including mine.

The advantage, to various flavors of malefactor, of subverting core pieces of Open-Source infrastructure, is incalculable. #XZ was the one we caught; how many have we missed?

What’s OSQI?

It’s an organization created by a national government. Obviously, more nations than one could have an OSQI.

The vast majority of the staff would be relatively-senior software engineers, with a small percentage of paranoid nontechnical security people (see below). You could do a lot with as few as 250 people, and the burdened cost would be trivial for a substantial government.

Since it is a matter of obvious fact that every company in the world with revenue of a billion or more is existentially dependent on Open Source, it would be reasonable to impose a levy of, say, 0.1% of revenue on all such companies, to help support this work. The money needn’t be a problem.

Structure

The selection of software packages that would get OSQI attention would be left to the organization, although there would be avenues for anyone to request coverage. The engineering organization could be relatively flat, most people giving individual attention to individual projects, then also ad-hoc teams forming for tool-building or crisis-handling when something like #XZ blows up.

Why would anyone work there?

The pay would be OK; less than you’d make at Google or Facebook, but a decent civil-service salary. There would be no suspicion that your employer is trying to enshittify anything; in fact, you’d start work in the morning confident that you’re trying to improve the world. The default work mode would be remote, so you could live somewhere a not-quite-Google salary would support a very comfortable way of life. There would be decent vacations and benefits and (*gasp*) a pension.

And there is a certain class of person who would find everyday joy in peeking and poking and polishing Open-Source packages that are depended on by millions of programmers and (indirectly) billions of humans. A couple of decades ago I would have been one.

I don’t think recruiting would be a problem.

So, what are OSQI’s goals and non-goals?

Goal: Safety

This has to come first. If all OSQI accomplishes is the foiling of a few #XZ-flavor attacks, and life becoming harder for people making them, that’s just fine.

Goal: Tool-building

I think it’s now conventional wisdom that Open Source’s biggest attack surfaces are dependency networks and build tools. These are big and complex problems, but let’s be bold and set a high bar:

Open-Source software should be built deterministically, verifiably, and reproducibly, from signed source-code snapshots. These snapshots should be free of generated artifacts; every item in the snapshot should be human-written and human-readable.

For example: As Kornel said, Seriously, in retrospect, #autotools itself is a massive supply-chain security risk. No kidding! But then everyone says “What are you gonna do, it’s wired into everything.”

There are alternatives; I know of CMake and Meson. Are they good enough? I don’t know. Obviously, GNU AutoHell can’t be swept out of all of the fœtid crannies where it lurks and festers, but every project from which it is scrubbed will present less danger to the world. I believe OSQI would have the scope to make real progress on this front.

Non-goal: Features

OSQI should never invest engineering resources in adding cool features to Open-Source packages (with the possible exception of build-and-test tools). The Open-Source community is bursting with new-features energy, most coming from people who either want to scratch their own itch or are facing a real blockage at work. They are way better positioned to make those improvements than anyone at OSQI.

Goal: Maintenance

Way too many deep-infra packages grow increasingly unmaintained as people age and become busy and tired and sick and dead. As I was writing this, a plea for help came across my radar from Sebastian Pipping, the excellent but unsupported and unfunded maintainer of Expat, the world’s most popular XML parser.

And yeah, he’s part of a trend, one that notably included the now-infamous XZ-Utils package.

And so I think one useful task for OSQI would be taking over (ideally partial) maintenance duties for a lot of Open-Source projects that have a high ratio of adoption to support. In some cases it would have to take a lower-intensity form, let’s call it “life support”, where OSQI deals with vulnerability reports but flatly refuses to address any requests for features no matter how trivial, and rejects all PRs unless they come from someone who’s willing to take on part of the maintenance load.

One benefit of having paid professionals doing this is that they will blow off the kind of social-engineering harassment that the #XZ attacker inflicted on the XZ-Utils maintainer (see Russ Cox’s excellent timeline) and which is unfortunately too common in the Open-Source world generally.

Goal: Benchmarking

Efficiency is an aspect of quality, and I think it would be perfectly reasonable for OSQI to engage in benchmarking and optimization. There’s a non-obvious reason for this: #XZ was unmasked when a Postgres specialist noticed performance problems.

I think that in general, if you’re a bad person trying to backdoor an Open-Source package, it’s going to be hard to do without introducing performance glitches. I’ve long advocated that unit and/or integration tests should include a benchmark or two, just to avert well-intentioned performance regressions; if they handicap bad guys too, that’s a bonus.
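In Go, at least, this costs nearly nothing. Here’s a generic example (no connection to any particular package) of a benchmark that lives beside the unit tests, in a file ending in _test.go, and runs under “go test -bench=.”; if a change slows the hot path, this number moves, and somebody gets to ask why:

package example // package name is illustrative

import (
    "bytes"
    "compress/flate"
    "io"
    "testing"
)

func BenchmarkInflate(b *testing.B) {
    // Build a fixed compressed input once, outside the timed loop.
    var buf bytes.Buffer
    w, err := flate.NewWriter(&buf, flate.DefaultCompression)
    if err != nil {
        b.Fatal(err)
    }
    w.Write(bytes.Repeat([]byte("all work and no play "), 4096))
    w.Close()
    compressed := buf.Bytes()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        r := flate.NewReader(bytes.NewReader(compressed))
        io.Copy(io.Discard, r) // decompress and discard; this is the timed work
        r.Close()
    }
}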

Goal: Education and evangelism

OSQI staff will develop a deep shared pool of expertise in making Open-Source software safer and better, and specifically in detecting and repelling multiple attack flavors. They should share it! Blogs, conferences, whatever. It even occurred to me that it might make sense to structure OSQI as an educational institution; standalone or as a grad college of something existing.

But what I’m talking about isn’t refereed JACM papers, but what my Dad, a Professor of Agriculture, called “Extension”: Bringing the results of research directly to practitioners.

Non-goal: Making standards

The world has enough standards organizations. I could see individual OSQI employees pitching in, though, at the IETF or IEEE or W3C or wherever, with work on Infosec standards.

Which brings me to…

Non-goal: Litigation

Or really any other enforcement-related activity. OSQI exists to fix problems, build tools, and share lessons. This is going to be easier if nobody (except attackers) sees them as a threat, and if staff don’t have to think about how their work and findings will play out in court.

And a related non-goal…

Non-goal: Licensing

The intersection between the class of people who’d make good OSQI engineers and those who care about Open-Source licenses is, thankfully, very small. I think OSQI should accept the license landscape that exists and work hard to avoid thinking about its theology.

Non-goal: Certification

Once OSQI exists, the notion of “OSQI-approved” might arise. But it’d be a mistake; OSQI should be an engineering organization; the cost (measured by required bureaucracy) to perform certification would be brutal.

Goal: Transparency

OSQI can’t afford to have any secrets, with the sole exception of freshly-discovered but still-undisclosed vulnerabilities. And when those vulnerabilities are disclosed, the story of their discovery and characterization needs to be shared entirely and completely. This feels like a bare-minimum basis for building the level of trust that will be required.

Necessary paranoia

I discussed above why OSQI might be a nice place to work. There will be a downside, though; you’ll lose a certain amount of privacy. Because if OSQI succeeds, it will become a super-high-value target for our adversaries. In the natural course of affairs, many employees would become committers on popular packages, increasing their attractiveness as targets for bribes or blackmail.

I recall a very senior security leader at an Internet giant once saying to me “We have thousands of engineers, and my job requires me to believe that at least one of them also has another employer.”

So I think OSQI needs to employ a small number of paranoid traditional-security (not Infosec) experts to keep an eye on their colleagues, audit their finances, and just be generally suspicious. These people would also worry about OSQI’s physical and network security. Because attackers gonna attack.

Pronunciation

Rhymes with “bosky”, of course. Also, people who work there are OSQIans. I’ve grabbed “osqi.org” and will cheerfully donate it in the long-shot case that this idea gets traction.

Are you serious?

Yeah. Except that I no longer speak with the voice of a powerful employer.

Look: For better or for worse, Open Source won. [Narrator: Obviously, for better.] That means it has become crucial civilizational infrastructure, which governments should actively support and maintain, just like roads and dams and power grids.

It’s not so much that OSQI, or something like it, is a good idea; it’s that not trying to achieve these goals, in 2024, is dangerous and insane.

A057X 30 Mar 2024, 7:00 pm

Yes, “A057X” is cryptic, but my new lens’s official monicker is “150-500mm F/5-6.7 Di III VC VXD” so let’s stick with that part number. It’s from Tamron and this is the Fujifilm X-Mount variation. Lens-geeking is my favorite part of photo-geeking and it’s great that more manufacturers are opening up to third-party lens builders.

Last May I wrote that I wanted a big-ass super-telephoto and now I have one. Let’s start with a little comparo. Here is (roughly) the same 2km-away landscape shot on the decades-old Tokina 400mm I’d been using since 2009, and on the new Tamron.

Distant waterfront, via 400mm Tokina Distant waterfront, via 150-500mm Tamron

If you care about this sort of thing you might want to enlarge these. A person is visible in the bottom picture, and another if you’re using Lightroom on a 4K screen.

Now let’s be honest; the color and flavor of the earlier picture is nicer, because the sun was just right; that’s why I strapped on the old glass. But the new-lens picture shows that yes, we do still make progress in analog technologies, and given the same light, there’d be more you could do with today’s lens.

Anyhow, here’s what it looks like.

Tamron 150-500mm F/5-6.7 Di III VC VXD on Fujifilm X-T2

That’s on a Fujifilm X-T2, one of the bulkier of Fuji’s X-cameras. What’s not instantly obvious is that the camera and lens are sitting on the lens’s tripod shoe. That camera is now eight years old and needs to be replaced, but I’m not fully won over by the latest X-cams and the lens was an easier trigger to pull.

The reviews all said “Considering what it does, it’s amazingly small and light!” Maybe, but in fact it’s a big freakin’ heavy hunk of metal and glass. A tripod really helps.

For the birds

Tripod? But everyone seems to think that this kind of lens is for shooting birds in flight. So I took it to our cabin this weekend to test that hypothesis. Thus I learned that you really can’t shoot birds unless you’re hand-holding the camera. And even then, you can’t unless you’ve been practicing. I managed to get one picture of a bird in flight, but it was just a seagull and not a terribly handsome one either.

Then a couple of visitors settled at the top of a nearby Douglas Fir. Here’s one. Yes, the sky was that blue.

Bald eagle at the top of a cone-scattered evergreen

Isn’t it handsome? If you look close, though, its tail is jammed against a branch. But then it bent over to peer out at something.

Bald eagle at the top of an evergreen, leaning forward

Aren’t those feathers beautiful? This was a big-ass tree and I wasn’t right next to it, either. Yay Tamron.

Little, big

Turns out this thing can focus relatively close-in for an item of its ilk, so you can do, um what would one call it, macro-at-a-distance?

Close-up of a crocus flower among tangled botanical debris

That’s a teeny little blossom. But when I’m looking out over the water, I always end up taking pictures of the mountains on the other side.

Distant mountain

That one is a damn long way away. The picture suffers from being reduced to fit into your browser. I wish I could give everyone in the world Lightroom and a good 4K monitor.

Note that…

None of the pictures via this lens could have been captured on any mobile-phone camera in the world. You have to go pretty far these days to get into that territory.

Bye, Allyson 16 Mar 2024, 7:00 pm

She’s gone. She lived well. We’ll miss her.

Allyson McGrane

We’ve known Ms McGrane since 2005, when she was a co-conspirator on the wonderful little local Northern Voice blogging conference. We worked on other stuff together and hung out now and then and carpooled to the Prairies once and I can’t remember ever getting the slightest bit upset with her.

Here is a good-bye note from her partner Shane. If you want to leave a note somewhere, leave it there.

Ally (rhymes with “valley”) was a fine dog-parent and a strong grant-writer and a first-rate teacher and a connoisseur of fine cooking equipment and Canadian football. If you’ve been to much in the way of Vancouver theatre and dance events over the years, there’s a good chance that she produced the event or secured its funding or educated the people who did those things.

I remember having coffee with her a couple years ago, she advising me on one of my projects, laughing together at the Byzantine complexities of granting bureaucracies and the childlike money-obliviousness of arts leaders and the excellence of the coffee on that morning. Easy to be with.

Mesothelioma is a bitch; 8% 5-year survival rate, and there wasn’t that much they could do for her by the time they got the diagnosis right. We visited her last week and she was herself, cynical about her situation but, it seemed, more or less at peace.

I won’t miss her as much as the dogs will, but there’s still a gap in my life.

Play My Music 10 Mar 2024, 7:00 pm

When I’m away from home, I still want to listen to the music we have at home (well, I can live without the LPs). We had well over a thousand CDs so that’s a lot of music, 12,286 tracks ripped into Apple Lossless. Except for a few MP3s from, well, never mind. This instalment of the De-Google Project is about ways to do that with less Big-Tech involvement.

The former Google Play Music, now YouTube Music, allowed you to load your tunes into the cloud and play them back wherever your phone or computer happened to be. Except that it used to be easy to upload — just point the uploader at your iTunes library — and now it’s hard, and then Google removed YouTube Music’s shuffle-your-uploads feature from Android Auto. Also they fired a bunch of YouTube Music contractors who were trying to unionize. So screw ’em.

I discovered three plausible ways to do this. First and most simply, dump the tunes onto a USB drive; wherever you are in the world, you can usually plug one in and play tunes from it.

Second, there’s Plex; you run a Plex server on one of your computers at home (in our case a recent Mac Mini) which you point at music and video directories, and it’ll serve them to clients on the Web or on phones or on platforms like WebOS and Roku.

Also, it’ll serve your media to anywhere in the world, using UPnP to drill an outgoing hole through your firewall. Obviously, this could make a security-sensitive person nervous and does bother me a bit, because UPnP’s history has featured some nasty vulnerabilities. I have a to-do to check whether the version on my dumbass telco ISP router is reasonably safe. I believe that Tailscale would offer a better security posture, but don’t want one more thing to manage.

Finally, Apple Music can apparently do what YouTube Music does; let you upload your tunes into the cloud and play them anywhere. But moving from one Big-Tech provider to another doesn’t feel like progress.

Does it work?

Setting it up on Plex was a Just-Works experience. The process even reached out through our modern Eero mesh to the old telco router and convinced it to set up the appropriate UPnP voodoo. If you open the Plex server admin interface it occasionally complains about a double-NAT situation but works anyhow.

Getting the USB working was kind of hilarious. First of all, I bought a 512G USB stick. (My Mac says it only has 460GB, but what’s 50G between friends?) USB-A because that’s what the car has. It took a couple of hours to copy all the music onto it.

Then I plugged the USB stick into the car and it showed up instantly in the “Sources” tab of the media player, but greyed out. I snickered when I noticed that all the car infotainment menus were crawling and stuttering. Asking the car’s mighty electronic brain to index that mountain of music was making it sweat. Anyhow, after a few minutes, I could access the USB and now it works fine, mostly.

By “mostly”, I mean that when I tell it to play music off the USB, it takes a few seconds for the music to start, then a minute or more to get its shit together and present a coherent picture of what it’s playing. And on one occasion, the music player just randomly switched over to the radio. So I suspect my inventory is pushing the poor little toy computer in the car pretty hard. But once it’s going, the presentation is nice:

Jaguar infotainment showing current music and weather

A few items to note here:

  1. “Musick” is the name I gave the USB key.

  2. That recording is Jesus’ Blood Never Failed Me Yet, a truly unique piece of work by British composer Gavin Bryars. Opinions vary; I think it’s magical but it’s one of the few pieces of music that I am absolutely forbidden to play anywhere my wife can hear it.

  3. The car software is way more flexible than Android Auto; this is just one of the car’s three screens and there are a lot of options for distributing your music and weather and maps and climate control across them.

Which is better?

It’s complicated. Obviously, the USB option doesn’t require any network bandwidth. And I think the album-art presentation is nicer than Plex’s. (You can see that here).

The audio quality is pretty well a wash. Plex is a little louder, I suspect them of Loudness-War tactics, which is probably OK in a car with its inevitable background noise. Plex also crossfades the song transitions, clever and pleasing but really not essential.

Plex is really nice software and I feel a little guilty that I’m not sending them any money. They do have a “Pro” level of service; must check it out.

Then of course Plex needs Android Auto. Which on the one hand I’m probably going to be running a lot if I’m driving around town to appointments. But… Android Auto is already a little shaky some days, not sure whether it’s crashing or the car software is creaking or it’s just yet another lousy USB-C connection (I am developing a real hate for that form factor).

Realistically, given that our car (a Jaguar I-Pace EV) wasn’t a big seller and is five years old, can I really count on Google and Jaguar to do what it takes to keep Android Auto running?

At this point I need to say a big “Thanks!” to everyone on Fedi/Mastodon who gave me good advice on how to approach this problem.

Anyhow, as of now, we have two alternatives that work well. The De-Googling march continues forward.

The De-Google Project 9 Mar 2024, 8:00 pm

My family, like most, depends on a lot of online services. And again like most, a lot of those services come from Big Tech giants in general and (in our case) Google in particular. And like many people, we are becoming less comfortable with that. So I’m going to try to be systematic about addressing the problem. This post summarizes our dependencies and then I’ll post blog pieces about updates as I work my way through the list. (The first is already posted, see below.)

I’m calling this the “De-Google” project because they’re our chief supplier of this stuff and it’s more euphonious than “De-BigTechInGeneral”.

Need             | Supplier                                    | Alternatives
Office           | Google Workspace                            | ?
Data sharing     | Dropbox                                     | ?
Video meetings   | Google Meet                                 | Jitsi, ?
Maps             | Google Maps                                 | Magic Earth, Here, something OSM-based
Browser          | Apple Safari                                | Firefox, ?
Search           | Google                                      | Bing-based options
Chat             | Signal                                      |
Photo editing    | Adobe Lightroom & Nik                       | Capture One, Darktable, ?
In-car interface | Google Android Auto                         | Automaker software
Play my music    | Plex, USB                                   |
Discover music   | Google YouTube Music                        | Qobuz, Tidal, Deezer, Pandora, ?
TV               | Prime, Roku, Apple, Netflix, TSN, Sportsnet | ?

The “Supplier” color suggests my feelings about what I’m using, with blue standing for neutral.

Criteria

To replace the things that I’m unhappy with, I’m looking for some combination of:

  1. Open source

  2. Not ad-supported

  3. Not VC-funded

  4. Not Google, Apple, Microsoft, or Amazon

Office

We’ve been using Gmail for a really long time and are used to it, and the integration between mail and calendar and maps basically Just Works. The price is OK but it keeps going up, and so do our data storage requirements, what with all the cameras in the family. Finally, Google has stewardship of our lives and is probably monetizing every keystroke. We’re getting a bit creeped out over that.

I think that calendars and email are kind of joined at the hip, so we’d want a provider that does both.

As for online docs, I will not be sorry to shake the dust of Google Drive and Docs from my heels, I find them clumsy and am always having trouble finding something that I know is in there.

Data sharing

Dropbox is OK, assuming you ignore all the other stuff it’s trying to sell you. Maybe one of these years I should look at that other stuff and see if it’s a candidate to replace one or two other services?

Video meetings

I dislike lots of things about Zoom and find Microsoft Teams a pool of pain, but have been pretty happy with Google Meet. Nobody has to download or log into anything and it seems to more or less Just Work. But I’d look at alternatives.

Maps

As I wrote in 2017, Google maps aggregate directions, reviews, descriptions, phone numbers, and office hours. They are potentially a nuclear-powered monopoly engine. I use Maps more and more; if I want to contact or interact with something whose location I know, it’s way quicker to pull up Maps and click on their listing than it is to use Google search and fight through all the ads and spam.

The calendar integration is fabulous. If you have Android Auto and you’re going to a meeting, pull up the calendar app and tap on the meeting and it drops you right into directions.

The quality of the OpenStreetMap data is very good, but obviously they don’t have the Directions functions. Who does? Obviously, Here does, and I was enthused about it in 2019; but Android Auto’s music powers drew me back to Google Maps. Aside from that, Magic Earth is trying, and their business model seems acceptable, but the product was pretty rough-edged last time I tried it.

Browser

Safari is my daily driver. These days Chrome is starting to creep me out a bit; just doesn’t feel like it’s on my side. Also, it’s no longer faster than the competition. I’d like to shift over to Firefox one day when I have the energy.

Then there are the Arcs and Braves and Vivaldis of this world, but I just haven’t yet invested the time to figure out if one of these will do, and I do not detect a wave of consensus out there.

By the way, DuckDuckGo has a browser, a shell over Safari on the Mac and Edge on Windows. Lauren uses it a lot. Probably worth a closer look.

Search

The decline of Google Search is increasingly in everyone’s face. Once again, it refuses to find things on this blog that I know are there.

Others in the family have already migrated to DuckDuckGo, and I now feel like an old-school laggard for still not having migrated off Google. I wish there were someone else taking a serious run at indexing the Web other than Bing — from yet another tech giant — but here we are.

Lauren tells me to have a closer look at Ecosia, which seems very wholesome.

Chat

At the moment you will have to pry Signal out of my cold, dead, hands. You should be using it too. ’Nuff said.

Photo editing

I pay my monthly tribute to Adobe, about whom my feelings aren’t as negative as they are about the mega Tech Giants. I’d like not to pay so much, and I’d like something that runs a little faster than Lightroom, and I’d like to support open source. But… I really like Lightroom, and sometimes one absolutely needs Photoshop, so I’m unlikely to prioritize this particular escape attempt.

In-car interface

Choices are limited. I see little point in migrating between Android Auto and CarPlay, which leaves the software the auto maker installed. Which, in my five-year-old Jaguar is… well, not bad actually. I think I could live with the built-in maps and directions from Here, even with the British Received Pronunciation’s butchery of North American place names.

But, I don’t know, we might stay with Android Auto. Check out this screenshot from my car.

Android Auto showing non-Google applications.

(Pardon the blurs and distortions.)

This is Android Auto displaying, as it normally does when I’m driving, maps and music. By default, Google Maps and YouTube Music. But not here; on the right is Plex, playing my own music stored on a Mac Mini at home.

On the left, it’s even more interesting: This is neither Google maps nor a competitor; it’s Gaia GPS, the app I normally use to mark trail while bushwhacking through Pacific Northwest rain forests. Somehow I fat-fingered it into place either in the car or on my phone.

The lesson here is that (for the moment at least) Android Auto seems to be genuinely neutral. It knows the general concepts of “apps that play music” and “apps that are maps” and is happy to display whichever ones you want, not just Google’s. (As a former Android geek who knows about Intents and Filters, I can see how this works. Clever.)

So far, Android Auto doesn’t show ads, but I suppose it’s monetizing me by harvesting traffic information to enrich its maps and I guess that’s a bargain I can live with. I use that data myself when I want to go somewhere and there are multiple routes and I can see which one is backed up by sewer work or whatever.

Discover music

I’ve been paying for YouTube Music since before it existed, and I’m genuinely impressed with the way its algorithm fishes up new artists that it turns out I really like. But just now Google laid off a bunch of YouTube Music “contractors” (de facto, employees) who tried to organize a union, so screw ’em.

I haven’t investigated any of the alternatives in depth yet.

Play my music

In the decades where Compact Disks were the way to acquire music, I acquired a lot. And ripped it. And pushed it up into Google’s musical cloud. And (until recently) could shuffle my musical life on YouTube Music. But they removed that feature from Android Auto, so screw ’em.

But I now have two good ways to do this. Check this out in Play My Music.

TV

The same gripe as everyone else: The streaming services have re-invented Cable TV, which I only got around to dumping a couple of years ago. The right solution is obvious: Pay-per-view at a reasonably low price, then the services could compete on producing great shows that people will pay to see, rather than sucking you into yet another subscription.

I suspect this column will stay red for quite a while. It’s amazing how much business leaders hate simple business models where there’s a clean clear one-time price for a product and customers have a clean clear choice who they buy their products from.

The path forward

I don’t know if I’ll ever turn the center column all-green. And I don’t need to; progress is progress. Anyhow, doing this sort of investigation is kind of fun.

Money Bubble 25 Feb 2024, 8:00 pm

I think I’m probably going to lose quite a lot of money in the next year or two. It’s partly AI’s fault, but not mostly. Nonetheless I’m mostly going to write about AI, because it intersects the technosphere, where I’ve lived for decades.

I’ve given up having a regular job. The family still has income but mostly we’re harvesting our savings, built up over decades in a well-paid profession. Which means that we are, willy-nilly, investors. And thus aware of the fever-dream finance landscape that is InvestorWorld.

The Larger Bubble

Put in the simplest way: Things have been too good for too long in InvestorWorld: low interest, high profits, the unending rocket rise of the Big-Tech sector, now with AI afterburners. Wile E. Coyote hasn’t actually run off the edge of the cliff yet, but there are just way more ways for things to go wrong than right in the immediate future.

If you want to dive a little deeper, The Economist has a sharp (but paywalled) take in Stockmarkets are booming. But the good times are unlikely to last. Their argument is that profits are overvalued by investors because, in recent years, they’ve always gone up. Mr Market ignores the fact that at least some of those gleaming profits are artifacts of tax-slashing by right-wing governments.

That piece considers the observation that “Many investors hope that AI will ride to the rescue” and is politely skeptical.

Popping the bubble

My own feelings aren’t polite; closer to Yep, you are living in a Nvidia-led tech bubble by Brian Sozzi over at Yahoo! Finance.

Sozzi is fair, pointing out that this bubble feels different from the cannabis and crypto crazes; among other things, chipmakers and cloud providers are reporting big high-margin revenues for real actual products. But he hammers the central point: What we’re seeing is FOMO-driven dumb money thrown at technology by people who have no hope of understanding it. Just because everybody else is and because the GPTs and image generators have cool demos. Sozzi has the numbers, looking at valuations through standard old-as-dirt filters and shaking his head at what he sees.

What’s going to happen, I’m pretty sure, is that AI/ML will, inevitably, disappoint; in the financial sense I mean, probably doing some useful things, maybe even a lot, but not generating the kind of profit explosions that you’d need to justify the bubble. So it’ll pop, and my bet is it takes a bunch of the finance world with it. As bad as 2008? Nobody knows, but it wouldn’t surprise me.

The rest of this piece considers the issues facing AI/ML, with the goal of showing why I see it as a bubble-inflator and eventual bubble-popper.

First, a disclosure: I speak as an educated amateur. I’ve never gone much below the surface of the technology, never constructed a model or built model-processing software, or looked closely at the math. But I think the discussion below still works.

What’s good about AI/ML

Spoiler: I’m not the kind of burn-it-with-fire skeptic that I became around anything blockchain-flavored. It is clear that generative models manage to embed significant parts of the structure of language, of code, of pictures, of many things where that has previously not been the case. The understanding is sufficient to reliably accomplish the objective: Produce plausible output.

I’ve read enough Chomsky to believe that facility with language is a defining characteristic of intelligence. More than that, a necessary but not sufficient ingredient. I dunno if anyone will build an AGI in my lifetime, but I am confident that the task would remain beyond reach without the functions offered by today’s generative models.

Furthermore, I’m super impressed by something nobody else seems to talk about: Prompt parsing. Obviously, prompts are processed into a representation that reliably sends the model-traversal logic down substantially the right paths. The LLMbots of this world may regularly be crazy and/or just wrong, but they do consistently if not correctly address the substance of the prompt. There is seriously good natural-language engineering going on here that AI’s critics aren’t paying enough attention to.

So I have no patience with those who scoff at today’s technology, accusing it of being a glorified Markov chain. Like the song says: Something’s happening here! (What it is ain’t exactly clear.)

It helps that in the late teens I saw neural-net pattern-matching at work on real-world problems from close up and developed serious respect for what that technology can do. An example is EC2’s Predictive Auto Scaling (and gosh, it looks like the competition has it too).

And recently, Adobe Lightroom has shipped a pretty awesome “Select Sky” feature. It makes my M2 MacBook Pro think hard for a second or two, but I rarely see it miss even an isolated scrap of sky off in the corner of the frame. It allows me, in a picture like this, to make the sky’s brightness echo the water’s.

Brightly-lit boats on dark water under a dark sky

And of course I’ve heard about success stories in radiology and other disciplines.

Thus, please don’t call me an “AI skeptic” or some such. There is a there there.

But…

Given that, why do I still think that the flood of money being thrown at this tech is dumb, and that most of it will be lost? Partly just because of that flood. When financial decision makers throw loads of money at things they don’t understand, lots of it is always lost.

In the Venture-Capital business, that’s an understood part of the business cycle; they’re looking to balance that out with a small number of 100x startup wins. But when big old insurance companies and airlines and so on are piling in and releasing effusive statements about building the company around some new tech voodoo, the outcome, in my experience, is very rarely good.

But let’s be specific.

Meaning

As I said above, I think the human mind has a large and important language-processing system. But that’s not all. It’s also a (slow, poorly-understood) computer, with access to a medium-large database of facts and recollections, an ultra-slow numeric processor, and facilities for estimation, prediction, speculation, and invention. Let’s group all this stuff together and call it “meaning”.

Have a look at Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data by Emily Bender and Alexander Koller (2020). I don’t agree with all of it, and it addresses an earlier generation of generative models, but it’s very thought-provoking. It postulates the “Octopus Test”, a good variation on the bad old Chinese-Room analogy. It talks usefully about how human language acquisition works. A couple of quotes: “It is instructive to look at the past to appreciate this question. Computational linguistics has gone through many fashion cycles over the course of its history” and “In this paper, we have argued that in contrast to some current hype, meaning cannot be learned from form alone.”

I’m not saying these problems can’t be solved. Software systems can be equipped with databases of facts, and who knows, perhaps some day estimation, prediction, speculation, and invention. But it’s not going to be easy.

Difficulty

I think there’s a useful analogy between the narratives around AI and of self-driving cars. As I write this, Apple has apparently decided that generative AI is easier than shipping an autonomous car. I’m particularly sensitive to this analogy because back around 2010, as the first self-driving prototypes were coming into view, I predicted, loudly and in public, that this technology was about to become ubiquitous and turn the economy inside out. Ouch.

There’s a pattern: The technologies that really do change the world tend to have strings of successes, producing obvious benefits even in their earliest forms, to the extent that geeks load them in the back doors of organizations just to get shit done. As they say, “The CIO is the last to know.”

Contrast cryptocurrencies and blockchains, which limped along from year to year, always promising a brilliant future, never doing anything useful. As to the usefulness of self-driving technology, I still think it’s gonna get there, but it’s surrounded by a cloud of litigation.

Anyhow, anybody who thinks that it’ll be easy to teach “meaning” (as I described it above) to today’s generative AI is a fool, and you shouldn’t give them your money.

Money and carbon

Another big problem we’re not talking about enough is the cost of generative AI. Nature offers Generative AI’s environmental costs are soaring — and mostly secret. In a Mastodon thread, @Quixoticgeek@social.v.st says We need to talk about data centres, and includes a few hard and sobering numbers.

Short form: This shit is expensive, in dollars and in carbon load. Nvidia pulled in $60.9 billion in 2023, up 126% from the previous year, and is heading for a $100B/year run rate, while reporting a 75% margin.

Another thing these articles don’t mention is that building, deploying, and running generative-AI systems requires significant effort from a small group of people who now apparently constitute the world’s highest-paid cadre of engineers. And good luck trying to hire one if you’re a mainstream company where IT is a cost center.

All this means that for the technology to succeed, it not only has to do something useful, people and businesses will also have to be ready to pay a high price for that something.

I’m not saying that there’s nothing that qualifies, but I am betting that it’s not in ad-supported territory.

Also, it’s going to have to deal with pushback from unreasonable climate-change resisters like, for example, me.

Anyhow…

I kind of flipped out, and was motivated to finish this blog piece, when I saw this: “UK government wants to use AI to cut civil service jobs: Yes, you read that right.” The idea (having citizen input processed and responded to by an LLM) is hideously toxic and broken, and it usefully reveals the kind of thinking that makes morally crippled leaders all across our system love this technology.

The road ahead looks bumpy from where I sit. And when the business community wakes up and realizes that replacing people with shitty technology doesn’t show up as a positive on the financials after you factor in the consequences of customer rage, that’s when the hot air gushes out of the bubble.

It might not take big chunks of InvestorWorld with it. But I’m betting it does.

Social Photos 15 Feb 2024, 8:00 pm

I like taking pictures, and I like sharing pictures wherever I hang out online. A problem with this is knowing that the pictures will very rarely look as good in other people’s browsers and apps as they do to me in Lightroom on a big bright 4K screen. Thus this piece, a basic investigation of how photos are processed and transformed on Mastodon, Bluesky, and Threads.

I was never that much of an Instagram poster; Insta does a good job of taking your crappy phone pix and juicing them up with filters so they look way better. That’s irrelevant to me, because not only do I like taking pictures, I like polishing them with Lightroom and Silver Efex and so on. So with a few exceptions, everything I want to share gets pulled onto my Mac and edited before I share it. And once I’ve done that, why would I post pictures anywhere but where I have my normal conversations?

The picture

Here it is:

Montana from the air

Taken with a Pixel 7 out an airplane window somewhere just west of Havre, Montana. It looks like there are two layers of clouds at the left of the picture, but if you look closely, I think the lower one is the Rocky Mountains in the distance.

That’s a big picture, both in its subject and raw size: The Pixel version, after editing, is 3814x2290. Also it has a lot of fine detail, and rewards zooming in. When I post it, I’d like some sense of the bigness to come across, and when tapped to enlarge, I’d like it to wow people a little, especially those fortunate enough to be looking at big screens. And I’d like it to be at least OK on your phone.

Normally, pictures here in the blog are limited to a maximum of 720x720 in the column of text, with the larger version you get by clicking limited to 1440x960. But in this case, if you click you get a 2558x1536 version, the objective being that that’ll be big enough to fill almost any screen it gets viewed on.
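For the curious: producing those renditions is nothing fancier than a bounded, aspect-preserving resize. Here’s a minimal sketch in Python with Pillow (not my actual publishing script, and the filenames are invented) of how you might generate the in-column and click-through versions:

    from PIL import Image

    # Bounded, aspect-preserving resize: 720x720 max for the in-column
    # rendition, 2560x2560 max for the click-through version (this
    # picture gets the big treatment). thumbnail() shrinks in place,
    # preserves the aspect ratio, and never enlarges.
    for bound, suffix in ((720, "column"), (2560, "click")):
        im = Image.open("montana-original.jpg")  # invented filename
        im.thumbnail((bound, bound), Image.LANCZOS)
        im.save(f"montana-{suffix}.jpg", quality=94)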

Methodology and apology

The question I want to investigate is: Which platforms are going to make my pictures look good? But I haven’t really figured out how to do that yet. To start with, what kind of picture would serve best as a benchmark for judging processing quality?

Anyhow, I picked this one and posted it to Mastodon, Bluesky, and Threads, and here gather data about the results. But hey, why not tap those links on whatever device you’re using right now and see what you think about how the picture looks there?

The columns are:

  1. Bytes: the size of the photo as downloaded.

  2. WxH: width and height, in pixels.

  3. “Q”: the JPEG quality, as reported by ImageMagick’s identify -verbose. The quotes are there because I’m not sure how to interpret it, or even whether it’s any use at all. (There’s a sketch of how to script this, just after the table.)

                       Bytes   WxH         “Q”
Original             1671514   2558x1536    94
Blog form             624961   1440x865     94
Bluesky Android FS    302972    864x663    n/a
Bluesky Android mini   42410    345x345    n/a
Bluesky Web FS        536345   2000x1201    80
Bluesky Web mini      112335   1000x601     80
Mastodon Web FS      1555111   2558x1536    90
Mastodon Web mini      86374    619x372     90
Phanpy Web FS        1555111   2558x1536    90
Phanpy Web mini        86374    619x372     90
Threads Web FS        888067   2160x1297    90
Threads Web mini      888067   2160x1297    90
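
Collecting those three numbers is scriptable, by the way. Here’s a sketch of what I mean in Python, assuming ImageMagick is installed; %w, %h, and %Q are its documented format escapes, and the filename is invented:

    import os
    import subprocess

    def photo_stats(path):
        # Bytes: just the file's size on disk.
        size = os.path.getsize(path)
        # Width, height, and reported JPEG quality via ImageMagick's
        # identify. Treat %Q with the same suspicion as the "Q" column.
        out = subprocess.run(
            ["identify", "-format", "%w %h %Q", path],
            capture_output=True, text=True, check=True,
        ).stdout.split()
        w, h, q = (int(n) for n in out)
        return size, w, h, q

    print(photo_stats("bluesky-web-fs.jpg"))  # invented filename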

Note that each of the posts included not one but two pictures, because I was also interested in how the platforms allocated screen space. The platforms typically have two display modes, “mini”, as shown in the feed, and “FS” for Full Size, what you get when you click on the picture.

Original/Blog form

Ideally, I’d like each platform’s presentation of the picture, when you click on it, to have the same number of pixels as my original, and for each pixel to have the same color value.
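That’s a mechanically checkable property. Here’s a little Pillow sketch (invented filenames again) that answers the question “same dimensions, and the same color value at every pixel?”:

    from PIL import Image, ImageChops

    def pixels_identical(original, downloaded):
        a = Image.open(original).convert("RGB")
        b = Image.open(downloaded).convert("RGB")
        if a.size != b.size:
            return False  # different pixel counts, so no
        # difference() is black everywhere iff the images match exactly;
        # getbbox() returns None when no pixel differs.
        return ImageChops.difference(a, b).getbbox() is None

    print(pixels_identical("montana-original.jpg", "mastodon-fs.jpg"))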

Bluesky

First up are numbers from the Android app, but please don’t take them seriously. The process of extracting them from the Pixel’s screen and getting them onto my Mac involved multiple irritating steps, any one of which may have damaged the bits. So I didn’t repeat the exercise for the other platforms. They’re mostly here to encourage me, should I pursue this further, to find a good clean way to extract this information.

I do note, however, that the “mini” form in the Bluesky Android feed really crushes those poor little pictures down and, for this particular picture, offers no suggestion that it’s big.

The Web version of Bluesky does not preserve my pixels; it coerces the width down to 2000 pixels for the FS version and 1000 for the mini.

Mastodon and Phanpy

Phanpy is an alternate client for Mastodon; I think it’s very good and it’s my daily driver. The table reveals that, in this case, the alternate client pulls in the same images as the official Web client, which is good.

It also reveals that Mastodon preserves the picture’s dimensions but obviously reprocesses it somehow, because the files come out (somewhat) smaller. I wish it didn’t do that. It’s open source; I should peek in and see what it actually does.

Phanpy does a better job of actually showing the pictures in-feed than the official Mastodon client, and both are nicer than Bluesky.

Threads

I had difficulty here: the Threads Web client is a tangly JavaScript fever dream, so it’s really hard to get at the underlying photos. But my efforts suggested that it uses the same picture for the “mini” and “FS” versions, just getting the browser to scale it down.

Furthermore, Threads doesn’t want pictures to be more than 2160 pixels wide.

Maybe I’m wrong

Because the experimental work was manual, and thus highly prone to fumblefingers and brain farts. If you think any of these numbers are wrong, you may be right; please yell at me.

I hesitate to offer conclusions, because this is, as noted at the top, the first step in what could be a large and interesting research project, one that I probably don’t have the expertise to conduct. But here are a few anyhow.

First, they all do a pretty good job. Second, none of them actually offer an opportunity to view my bits exactly as uploaded, which I think they should. Third, client designers should follow Phanpy’s lead in figuring out how to make better use of screen real-estate to highlight images.

What Lewis Carroll Said

And I quote: “What is the use of a book,” thought Alice, “without pictures or conversations?”
