ChicagoBoy11 14 minutes ago [-]
For anyone who liked this, I highly suggest you take a look at the CuriousMarc youtube channel, where he chronicles lots of efforts to preserve and understand several parts of the Apollo AGC, with a team of really technically competent and passionate collaborators.
One of the more interesting things they have been working on is a potential re-interpretation of the infamous 1202 alarm. It is, as of this writing, popularly described as something related to nonsensical sensor readings that could be (and were) safely ignored during the actual moon landing. However, if I remember correctly, some of their investigation revealed that there were many conditions under which that error would have been extremely critical and would likely have doomed the astronauts. It is super fascinating.
deepsun 10 minutes ago [-]
And that's why it's harder (or easier?) to make the same landing again -- we're taking far fewer chances. Today we know of way more failure modes than back then.
jwpapi 1 hours ago [-]
Has someone verified this was an actual bug?
One of AI’s strengths is definitely exploration, e.g. in finding bugs, but it still has a high false positive rate. Depending on context, that matters or it doesn’t.
Also one has to be aware that there are a lot of bugs that AI won’t find but humans would
I don’t have the expertise to verify this bug actually happened, but I’m curious.
throwaway27448 21 minutes ago [-]
It's not even clear if AI was used to find the bug: they mention modeling the software with an "ai native" language, whatever that means. What is not clear is how they found themselves modeling the gyros software of the apollo code to begin with.
But, I do think their explanation of the lock acquisition and the failure scenario is quite clear and compelling.
jll29 7 minutes ago [-]
> It's not even clear if AI was used to find the bug: they mention modeling the software with an "ai native" language, whatever that means.
Could the "AI native language" they used be Apache Drools?
The "when" syntax reminded me of it...
(Apache Drools is an open source rule language and interpreter to declaratively formulate and execute rule-based specifications; it easily integrates with Java code.)
caminante 5 minutes ago [-]
How did you pick out AI native and miss the rest of the SAME sentence?
> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.
Qwuke 6 minutes ago [-]
>It's not even clear if AI was used to find the bug
It's not even clear you read the article
caminante 3 minutes ago [-]
Even worse, the other child comments are speculating (and didn't RTFA either) when the answer is clear in the article.
> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.
Aurornis 11 minutes ago [-]
> It's not even clear if AI was used to find the bug
The intro says “We used Claude and Allium”. Allium looks like a tool they’ve built for Claude.
So the article is about how they used their AI tooling and workflow to find the bug.
josephg 2 hours ago [-]
Super interesting. I wish this article wasn’t written by an LLM though. It feels soulless and plastic.
The AI writing detectors are very unreliable. This is important to mention because they can trigger in the opposite direction (reporting human written text as AI generated) which can result in false accusations.
It’s becoming a problem in schools as teachers start accusing students of cheating based on these detectors or ignore obvious signs of AI use because the detectors don’t trigger on it.
croemer 52 minutes ago [-]
Pangram doesn't reliably detect individual LLM-generated phrases or paragraphs among human written text.
It seems to look at sections of ~300 words. And for one section at least it has low confidence.
I tested it by getting ChatGPT to add a paragraph to one of my sister comments. Result is "100% human" when in fact it's only 75% human.
Then pangram isn't very good, because that article is full of Claude-isms.
embedding-shape 1 hours ago [-]
> because that article is full of Claude-isms
Not sure how I feel about the whole "LLMs learned from human texts, so now the people who helped write human texts are suddenly accused of plagiarizing LLMs" thing yet, but seems backwards so far and like a low quality criticism.
xmcqdpt2 8 minutes ago [-]
I'm sure some human writers would write:
> The specification forces this question on every path through the IMU mode-switching code. A reviewer examining BADEND would see correct, complete cleanup for every resource BADEND was designed to handle.
> The specification approaches from the other direction: starting from LGYRO and asking whether any paths fail to clear it.
> *Tests verify the code as written; a behavioural specification asks what the code is for.*
However this is a blog post about using Claude for XYZ, from an AI company whose tagline is
"AI-assisted engineering that unlocks your organization's potential"
Do you really think they spent the time required to actually write a good article by hand? My guess is that they are unlocking their own organization's potential by having Claude write the posts.
snapcaster 37 minutes ago [-]
Real talk. You're not just making a good point -- you're questioning the dominant paradigm
jnwatson 29 minutes ago [-]
Horrible
DiffTheEnder 1 hours ago [-]
Is it possible for a tool to know if something is AI written with high confidence at all? LLMs can be tuned/instructed to write in an infinite number of styles.
Don't understand how these tools exist.
gcr 57 minutes ago [-]
The WikiEDU project has some thoughts on this. They found Pangram good enough to detect LLM usage while teaching editors to make their first Wikipedia edits, at least enough to intervene and nudge the student. They didn’t use it punitively or expect authoritative results, however. https://wikiedu.org/blog/2026/01/29/generative-ai-and-wikipe...
They found that Pangram suffers from false positives in non-prose contexts like bibliographies, outlines, formatting, etc. The article does not touch on Pangram’s false negatives.
I personally think it’s an intractable problem, but I do feel pangram gives some useful signal, albeit not reliably.
cameronh90 1 hours ago [-]
It has Claude-isms, but it doesn't feel very Claude-written to me, at least not entirely.
What's making it even more difficult to tell now is people who use AI a lot seem to be actively picking up some of its vocab and writing style quirks.
timdiggerm 37 minutes ago [-]
So you're saying Pangram isn't worth much?
ChrisRR 2 hours ago [-]
It's not setting off any LLM alarm bells to me. It just reads like any other scientific article, which is very often soulless
embedding-shape 2 hours ago [-]
Any specific sections that stick out? Juxt in the past had really great articles, even before LLMs, and I know for a fact they don't lack the expertise or knowledge to write for themselves if they wanted to. While I haven't completely read this article yet, it'd surprise me if they just let LLMs write articles for them today.
croemer 2 hours ago [-]
Here's one tell-tale of many: "No alarm, no program light."
Another one: "Two instructions are missing: [...] Four bytes."
One more: "The defensive coding hid the problem, but it didn’t eliminate it."
monooso 2 hours ago [-]
That's just writing. I frequently write like that.
This insistence that certain stylistic patterns are "tell-tale" signs that an article was written by AI makes no sense, particularly when you consider that whatever stylistic tics an LLM may possess are a result of it being trained on human writing.
croemer 2 hours ago [-]
These are just some of the good examples I found.
My hunch that this is substantially LLM-generated is based on more than that.
In my head it's like a Bayesian classifier, you look at all the sentences and judge whether each is more or less likely to be LLM vs human generated. Then you add prior information like that the author did the research using Claude - which increases the likelihood that they also use Claude for writing.
Maybe your detector just isn't so sensitive (yet) or maybe I'm wrong but I have pretty high confidence at least 10% of sentences were LLM-generated.
Yes, the stylistic patterns exist in human speech but RLHF has increased their frequency. Also, LLM writing has a certain monotonicity that human writing often lacks. Which is not surprising: the machine generates more or less the most likely text in an algorithmic manner. Humans don't. They write a few sentences, then get a coffee, sleep, write a few more. That creates more variety than an LLM can.
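A minimal sketch of the "Bayesian classifier in my head" framing from the comment above, under loud assumptions: the function name `posterior_llm`, the prior values, and the per-sentence likelihood ratios are all hypothetical and made up for illustration, and sentences are treated as independent (the naive Bayes simplification). It estimates how likely it is that the whole text is LLM-generated, which is a different quantity from the "10% of sentences" figure quoted above.

```python
import math


def posterior_llm(prior_llm, likelihood_ratios):
    """Return P(LLM | sentences) from a prior and per-sentence ratios.

    Each ratio is P(sentence | LLM) / P(sentence | human): above 1 means the
    sentence looks more LLM-like, below 1 means it looks more human.
    Sentences are assumed independent, so their log-ratios simply add.
    """
    log_odds = math.log(prior_llm / (1.0 - prior_llm))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical ratios for four sentences (illustrative, not measured).
ratios = [2.0, 1.5, 0.8, 3.0]

# Same evidence, two priors: a neutral one, and a higher one reflecting
# "the author already used Claude for the research".
neutral = posterior_llm(0.5, ratios)
informed = posterior_llm(0.7, ratios)
print(f"neutral prior:  {neutral:.3f}")
print(f"informed prior: {informed:.3f}")
```

The point of the sketch is only that the prior shifts the conclusion without any change in the sentence-level evidence. Where the ratios come from is the contested part, and nothing here settles it.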
Here's an alternative way of thinking about this...
Someone probably expended a lot of time and effort planning, thinking about, and writing an interesting article, and then you stroll by and casually accuse them of being a bone idle cheat, with no supporting evidence other than your "sensitive detector" and a bunch of hand-wavy nonsense that adds up to naught.
xmcqdpt2 2 minutes ago [-]
To start, this is more or less an advertising piece for their product. It's pretty clear that they want to sell you Allium. And that's fine! They are allowed! But even if that was written by a human, they were compensated for it. They didn't expend lots of effort and thinking, it's their job.
More importantly, it's an article about using Claude from a company about using Claude. I think on the balance it's very likely that they would use Claude to write their technical blog posts.
kenjackson 21 minutes ago [-]
While I agree with the sentiment, using AI to write the final draft of the article isn’t cheating. People may not like it, but it’s more a stylistic preference.
bookofjoe 14 minutes ago [-]
Yet another way the mere possibility of AI/LLM being involved diminishes the value of ALL text.
If there is constant vigilance on the part of the reader as to how it was created, meaning and value become secondary, a sure path to the death of reading as a joy.
oscaracso 1 hours ago [-]
I am reminded of the Simpsons episode in which Principal Skinner tries to pass off the hamburgers from a near-by fast food restaurant for an old family recipe, 'steamed hams,' and his guest's probing into the kitchen mishaps is met with increasingly incredible explanations.
brookst 16 minutes ago [-]
I’m so glad the witch hunt has moved on to phrasing so I get less grief for my em dashes.
In theory, it wouldn't be too hard to settle the question of whether he used ChatGPT to write it: get Olang to write a few paragraphs by hand, then have people judge (blindly) whether it's the same style as the article, and which one sounds more like ChatGPT.
embedding-shape 57 minutes ago [-]
The times I've written articles, and those have gone through multiple rounds of reviews (by humans) with countless edits each time, before it ends up being published, I wonder if I'd pass that test in those cases. Initial drafts with my scattered thoughts usually are very different from the published end results, even without involving multiple reviewers and editors.
360MustangScope 2 hours ago [-]
I hate that I can’t write em dashes freely anymore without people accusing the writing of being AI generated.
Even though they are perfect for usage in writing down thoughts and notes.
croemer 2 hours ago [-]
I have nothing against em dashes. As long as your writing is human, experienced readers will be able to tell it's human. Only less experienced ones will use all or nothing rules. Em dashes just increase the likelihood that the text was LLM generated. They aren't proof.
brookst 12 minutes ago [-]
That nuance is lost on the majority of anti-AI folks who’ve learned they get positive social reactions by declaring essentially everything to be AI written and condemnable.
“An em dash… they’re a witch!”… “it’s not just X, it’s Y… they’re a witch!”
tapoxi 1 hours ago [-]
This is my exact writing style - I'm screwed.
croemer 1 hours ago [-]
I doubt you write like that. Where can I find your writing other than your comments which IMO don't read like the blog post?
TruffleLabs 1 hours ago [-]
This is just writing; terse maybe and maybe not grammatically correct, but people write like that.
croemer 1 hours ago [-]
It's not just terseness, it's the rhythm and "it's not x, it's y".
In fact, the latter is the opposite of terseness. LLMs love to tell you what things are not way more than people do.
(The irony that I started with "it's not just" isn't lost on me)
TruffleLabs 1 hours ago [-]
"Written by an LLM" based on what data or symptom?
ModernMech 2 hours ago [-]
I'm starting to develop a physiological response when I recognize AI prose. Just like an overwhelming frustration, as if I'm hearing nails on chalkboard silently inside of my head.
voodooEntity 2 hours ago [-]
I feel ya.... and i have to admit in the past i tried it for one article in my own blog thinking it might help me to express... tho when i read that post now i dont even like it myself its just not my tone.
therefore decided not gonna use any llm for blogging again and even tho it takes a lot more time without (im not a very motivated writer) i prefer to release something that i did rather than some llm stuff that i wouldnt read myself.
rudhdb773b 56 minutes ago [-]
Not to single out your comment, but it feels like it's gotten to the point where HN could use a rule against complaining about AI generated content.
It seems like almost every discussion has at least someone complaining about "AI slop" in either the original post or the comments.
Aurornis 4 minutes ago [-]
I disagree. I like to read articles and explore Show HN posts, but in the past 6 months I’ve wasted a lot of time following HN links that looked interesting but turned out to be AI slop. Several Show HN posts lately have taken me to repos that were AI generated plagiarisms of other projects, presented on HN as their own original ideas.
Seeing comments warning about the AI content of a link is helpful to let others know what they’re getting into when they click the link.
For this article the accusations are not about slop (which will waste your time) but about telltale signs of AI tone. The content is interesting but you know someone has been doing heavy AI polishing, which gives articles a laborious tone and has a tendency to produce a lot of words around a smaller amount of content (in other words, you’re reading an AI expansion of someone’s smaller prompt, which contained the original info you’re interested in).
Being able to share this information is important when discussing links. I find it much more helpful than the comments that appear criticizing color schemes, font choices, or that the page doesn’t work with JavaScript disabled.
Gigachad 14 minutes ago [-]
HN has gotten to the point where it’s not even worth clicking the link because of course it’s ai slop.
There is some real content in the haystack, but we almost need some kind of curator to find and display it rather than a vote system where most people vote on the title alone.
brookst 8 minutes ago [-]
If you’re looking for a place that surfaces only human-written content regardless of whether it’s interesting, rather than interesting content regardless of how it was written, HN is not the place.
There might be a market for your alternative though. Should be easy enough to build with Claude Code.
NiloCK 1 hours ago [-]
This is the top reply on a substantial percentage of HN posts now and we should discourage it.
It is:
- sneering
- a shallow dismissal (please address the content)
- curmudgeonly
- a tangential annoyance
All things explicitly discouraged in the site guidelines. [1]
Downvoting is the tool for items that you think don't belong on the front page. We don't need the same comment on every single article.
It's not a shallow dismissal; it's a dismissal for good reason. It's tangential to the topic, but not to HN overall. It's only curmudgeonly if you assume AI-written posts are the inevitable and good future (aka begging the question). I really don't know how it's "sneering", so I won't address that.
masklinn 58 minutes ago [-]
> Downvoting is the tool for items that you think don't belong on the front page.
You can’t downvote submissions. That’s literally not a feature of the site. You can only flag submissions, if you have more than 31 karma.
NiloCK 45 minutes ago [-]
Twelve year old account and who knows how much lurking before that and I've never noticed this. Good lord.
Optimistically, I guess I can call myself some sort of live-and-let-live person.
monooso 1 hours ago [-]
No idea why you're being downvoted. I've done my bit to redress the balance, I hope others do the same.
mpalmer 1 hours ago [-]
I've seen way, way worse. Either someone LLM-polished something they already wrote, or they did their own manual editing pass.
The short sentence construction is the most suspicious, but I actually don't see anything glaring. It normally jumps out and hits me in the face.
it's actually the second one I read that fit that description.
riverforest 48 minutes ago [-]
Software that ran on 4KB of memory and got humans to the moon still has undiscovered bugs in it. That says something about the complexity hiding in even the smallest codebases.
whiplash451 20 minutes ago [-]
My guess is that in such low memory regimes, program length is very loosely correlated with bug rate.
If anything, if you try to cram a ton of complexity into a few kb of memory, the likelihood of introducing bugs becomes very high.
MeteorMarc 17 minutes ago [-]
Are there any consequences for the Artemis 2 mission (ironic)?
wg0 1 hours ago [-]
Someone please amend the title and add "using claude code" because that's customary nowadays.
yodon 2 hours ago [-]
This is so insightfully and powerfully written I had literal chills running down my spine by the end.
What a horrible world we live in where the author of great writing like this has to sit and be accused of "being AI slop" simply because they use grammar and rhetoric well.
dotancohen 1 hours ago [-]
I was completely riveted the whole read. The description of Collins' dilemma is the first time I've seen an actual real world scenario described that might cause him to return to Earth alone.
If an LLM wrote that, then I no longer oppose LLM art.
breakingcups 22 minutes ago [-]
I thought that was the least likeable part of the article. They speculated wildly, somehow making the leap that a trained astronaut would not resort to a computer reset if the problems persisted to weave the narrative that this bug was super-duper-serious indeed. They didn't need that and it weakened the presentation.
Could the "AI native language" they used be Apache Drools? The "when" syntax reminded me of it...
https://kie.apache.org/docs/10.0.x/drools/drools/language-re...
Pangram test result: https://www.pangram.com/history/1ee3ce96-6ae5-4de7-9d91-5846...
ChatGPT session where it added a paragraph that Pangram misses: https://chatgpt.com/share/69d4faff-1e18-8329-84fa-6c86fc8258...
Fun exercise: https://en.wikipedia.org/wiki/Wikipedia:AI_or_not_quiz
For what it’s worth, Pangram reports that Marcus’ article is 100% LLM-written: https://www.pangram.com/history/640288b9-e16b-4f76-a730-8000...
See https://www.blakestockton.com/dont-write-like-ai-1-101-negat...
[1] - https://news.ycombinator.com/newsguidelines.html
1. Use Short Sentences
https://www.wordsthatsing.com.au/post/hemingway-rules