keiferski 2 days ago [–]

I think if one designed a “crisis-proof” version of the web, it might end up being a network of PDFs. My reasoning being:

PDFs are universally understood by most people and can be read on phones, desktops, laptops, and eBook readers.
Once you’ve downloaded a local PDF version of the site, there is no risk that it can be changed or removed by the host.
File size is predictable ahead of time, which is useful if your connection is limited or slow.
PDFs are designed for printing (more so than most sites), which may be useful in situations where electricity is in short supply. reply

Tajnymag 2 days ago [–]

Except for the printing part, all of this can be said about a standalone HTML file. External content, like images, can be inlined, so you would only have to distribute a single .html file. I'm not sure how file-to-file linking would work in the realm of PDF files. With HTML files, it's easy even without a web server. Plus, HTML can even be read through a terminal interface. That cannot be said of PDF documents, given their binary nature. reply
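As a rough illustration of the inlining idea above, here is a minimal Python sketch that rewrites local image references into base64 data URIs so the page can travel as one file. The function name `inline_images` and the regex-based matching are assumptions made for this example, not an existing tool; a robust version would use a real HTML parser and also handle CSS and fonts.

```python
import base64
import mimetypes
import re
from pathlib import Path


def inline_images(html: str, base_dir: Path) -> str:
    """Replace local <img src="..."> references with base64 data URIs."""

    def to_data_uri(match: re.Match) -> str:
        src = match.group(2)
        # Leave remote and already-inlined images untouched.
        if src.startswith(("http://", "https://", "data:")):
            return match.group(0)
        path = base_dir / src
        if not path.is_file():
            # Missing file: keep the original reference rather than guess.
            return match.group(0)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"{match.group(1)}data:{mime};base64,{encoded}{match.group(3)}"

    # Hypothetical simplification: only double-quoted src attributes are matched.
    pattern = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)
    return pattern.sub(to_data_uri, html)
```

Calling `inline_images(page_source, Path("site"))` returns the same markup with each local image embedded, at the cost of roughly a third more bytes per image because of the base64 encoding.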

keiferski 2 days ago [–]

This is true, but I do think a PDF is just conceptually simpler and requires less technical knowledge. Especially in a situation where technical users are scarce. IMO most people have a mental model of a PDF as being a digital document, whereas an HTML file is somewhat more amorphous. reply

nulbyte 2 days ago [–]

I use a terminal pager with PDFs quite frequently. It works surprisingly well. Even something you wouldn't expect, like a pay stub, renders fine in the terminal. reply

bashinator 2 days ago [–]

What do you use to display the pdf? Pandoc? reply

dredmorbius 1 day ago [–]

Pandoc is a converter, not a viewer. Typically one would view a PDF with a dedicated viewer (xpdf, zathura, kpdf, Okular, Evince, ...), or by converting to text (pdftotext). less can be extended with hooks (lesspipe) to read a wide range of file types on the console. Some console file managers can also translate PDFs to text (mc, ranger). reply

lucideer 2 days ago [–]

"PDFs are universally understood by most people and can be read on phones, desktops, laptops, and eBook readers."

PDFs need a proprietary app to use, most of which are loaded with spyware & trackers. I may be mistaken in this, but macOS/iOS are the only OSes I know of that read them natively? There's absolutely nothing universal about the format. HTML is truly universal: not only does every OS come with a built-in HTML viewer, but it's a plain text file. You can read the source using anything.

"Once you’ve downloaded a local PDF version of the site, there is no risk that it can be changed or removed by the host."

Once you've downloaded a local HTML version of the page, there's no risk that it can be changed or removed by the host. Yes, there are caveats to both: people can create PDFs with remote embeds or HTML sites with ajax content, but both of these are the fault/responsibility of the individual author. It's as easy to make good downloadable HTML as downloadable PDF.

The so-called "churn" is the responsibility of the individual HTML author. If you're making bad HTML, the fix is to start making good HTML. Not to switch to a closed inaccessible format. reply

npteljes 1 day ago [–]

PDF is an open format, with multiple FOSS reader implementations. You could argue that a subset of niche features can only be used in Acrobat Reader, but AR is far from the only PDF reader out there. And the churn is part of the zeitgeist, not really a responsibility of anyone in particular. Individuals are suckered into it, companies are supplying it, and governments are allowing it. We're all part of it. Not new either: I've been hearing since the 90s how modern life is rushed, and that's just my limited experience. reply

lucideer 1 day ago [–]

I said it wasn't universal, which is somewhat different to the vague idea of being "open", and yes, PDF is technically an "open format" depending on how you define "open". The ISO 32000 spec costs in the region of 200 USD/EUR. What that "openness" translates into in the real world is that there are zero non-Adobe viewers that support all of PDF's features, and even fewer PDF editors. The standard PDF editor costs 200 USD/EUR (annual subscription). This is before we even get into the nightmarish world of PDF parsing. Or PDF accessibility. PDF is a great format if you're sending a document to someone for them to print immediately. It has no other valid uses imo. reply

npteljes 1 day ago [–]

I see your point, thanks for the elaboration. I'm not a fan of the format either. reply

ChrisMarshallNY 2 days ago [–]

PDFs are...not so easy to generate dynamically. I have done it with a couple of PHP libraries (fpdf and mpdf), but they are primitive, compared to desktop PDF generators. I know that you can use Java (never done that), or even...ugh...XSL (also never done that). reply

dredmorbius 1 day ago [–]

Most desktop operating systems offer print-to-PDF functionality. It's long been an add-on for Microsoft, but that's really a historical accident / deliberate choice of that platform. PDFs can be trivially created from Markdown or using LaTeX templates if you're looking for a programmatic solution. Pandoc and XeLaTeX are helpful, the poppler libraries as well. Again, these are generally and widely available at no charge. reply

npteljes 1 day ago [–]

Not sure about your points. Contrast it with a static HTML+CSS website:

PDFs require a reader, HTML a browser. I wouldn't argue that there are more PDF readers installed than browsers.
Downloaded static HTML works the same
File size can be included in the HTTP response: in the Content-Length header
Printing is nice, but reflowable text is even nicer, since we target a multitude of rendering targets. reply
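To make the Content-Length point concrete, here is a small Python sketch that asks a server for headers only and reports the advertised size before anything is downloaded. The names `advertised_size` and `head_size` are made up for this example, and it assumes the server answers HEAD requests and sends a Content-Length header, which not every server does (for example with chunked or dynamically generated responses).

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit


def advertised_size(headers) -> Optional[int]:
    """Return Content-Length as an int, or None if it is absent or malformed."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def head_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request and return the size the server advertises for url."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        # HEAD returns the headers only, so no body is transferred.
        conn.request("HEAD", target)
        response = conn.getresponse()
        return advertised_size(response.headers)
    finally:
        conn.close()
```

On a slow link, a client could call `head_size(url)` first and decide whether the page is worth fetching; a result of `None` means the server did not say.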

foxes 2 days ago [–]

How is a complicated binary better than a literal text file? Truly absurd, this whole thread is churn. reply

lmm 1 day ago [–]

I set up my blog so that the page source would consist of the original markdown and as little markup as possible to make that render. You can read it with telnet and the experience isn't so much worse than using a browser. (The actual part that makes this work is a pile of opaque javascript doing all sorts of nasty things at runtime, but such is the way of web pages in today's browsers, I don't worry too much about it). reply

ergot_vacation 1 day ago [–]

The sad thing is, this is what the web was SUPPOSED to be, more or less: a series of static documents, text and images. The only interactivity (setting aside the occasional CGI forms) was that you could click certain images or text and go to other static documents. Documents linked to documents. Then everyone lost their minds and decided webpages needed to be PROGRAMS and we've been paying the price ever since. reply

jedimastert 1 day ago [–]

I find this to be a super interesting response. When I settled into my current website design, I ended up basically writing an article for the homepage. I'm not a designer by any stretch, and it was the most attractive homepage I could make, and I still really like it. I used a very similar workflow (and continue to for articles) to the papers I wrote in college, and would really only take one more step to get that to final pdf state. I'm torn between leaning into the static nature of the site and implementing the wiki I've been thinking about making reply

101008 2 days ago [–]

I've been doing something similar for 4 years now. I converted my niche website into a monthly magazine that is released as a PDF (and also uploaded to Issuu). It has its good sides and bad sides. People will download the PDF every month when there is a new issue, but you don't know if they read it, how much time they spend on it, etc. You won't appear in Google results as you would if you posted the articles as HTML, etc. Based on my experience, I just keep doing it as an experiment and because I enjoy saying I run a digital magazine, but the truth is that there are no real advantages to it. reply

jl6 1 day ago [–]

"you don't know if they read it, how much time they spend on it, etc."

This is an excellent feature, for the user. reply

101008 1 day ago [–]

Yes, for the user it has some advantages:

Download it and keep it forever.
Read offline.
Be able to share it through email, etc.
Print it and read it in a nice place! (I encourage this)

Of course, it has some downsides:

Not responsive, so people who download it from a phone may hate it.
No accessibility. reply

schipplock 2 days ago [–]

The text is too small to read on my phone. I can zoom in, but then I have to scroll horizontally. I’m afraid this website isn’t targeting me. reply

leephillips 1 day ago [–]

We already have a wildly popular website where all the main content is in the form of PDFs. It’s https://arxiv.org/. PDF is what you use when your document needs to have a predictable layout. This is especially important if it contains math, complex tables, or any elements where meaning is carried by positioning on the page. This can include aesthetic meaning, as in some forms of poetry that need to be laid out in a particular way. reply

dredmorbius 1 day ago [–]

There are several which at least strongly resemble that remark. Project Gutenberg and the Internet Archive's text archives (along with numerous other document-oriented sites, several of the samizdat variety) offer content in PDF and other document-oriented offline downloadable formats. Wikipedia has a "save to PDF" link on each article (that seems to work through the browser's capabilities, if any; not all browsers support this). The sister Mediawiki site Wikisource offers ePub downloads. For longer-form content, PDF, DJVU, and a handful of other formats (arguably ePub) are at least reasonably popular. reply

Symbiote 2 days ago [–]

PDFs used to be unreadable on small screens, but now you can reflowthem. (Pasted verbatim, retaining the missing space.) I don't see this feature in Firefox's viewer, or the default Android one. Can anyone recommend a FOSS PDF viewer that has it? (It must be FOSS, otherwise the point about using PDF to avoid tracking is lost.) reply

nulbyte 2 days ago [–]

Book Reader can reflow PDFs. It is very simple, which I like. But it adds any PDF you open to the library when you open the app, which I find only slightly annoying for non-books. reply

dalbasal 2 days ago [–]

I found "PDFs are files" kind of compelling. Perhaps this was a flaw of the original www concept. Web pages were always technically files & documents, but this was always abstracted away from userland. "Save webpage" was never a core feature. This did disempower users. PDFs are downloaded, saved, emailed around. They can also be linked to. Userland maintains a closer relationship with what's going on. A typical user knows that you can have a copy of a file, which may or may not be identical to the online one. WWW, from its initial version, was mysterious. The transition from the model of requesting files from a server by clicking a link to a programmatically generated stream of code executed in your browser happened below the typical user's perspective. The web has obviously gained a lot, but has also lost something. reply

BenjiWiebe 1 day ago [–]

I've definitely used saved webpages a lot. When we had dialup email only, my dad would drive to the library with a flash drive and download Web pages to bring home and read. It was great. Of course, it's even greater now that I can load it fresh even faster. reply

emptyfile 2 days ago [–]

Instead of writing text let me make some more noise by shoving PDFs for no reason. reply

yesenadam 1 day ago [–]

This was a great read. I'm sympathetic! I've had a website (Wordpress) for almost 10 years, but have stopped adding stuff to it lately, because I'm sick of the formatting changing on pages! I look again at a page that used to look great, now the vertical spacing is wrong, or tables have gone out of shape, or the font has changed to something awful. Maybe it's wordpress, maybe it's my bad css/html skills, maybe something else, not sure. I picked up LaTeX skills about 5 years ago and have just been making lovely PDF books of everything I'm into. And they stay just the way I made them. Kind of a shame though, no-one else gets to see them. Yet. reply

Santosh83 2 days ago [–]

Why not just publish static HTML with CSS only? It is, to my mind, better and more accessible than either PDF or a Javascript SPA. reply

TheCoelacanth 1 day ago [–]

And if you bundle that HTML and CSS as an EPUB, it's just as self-contained as a PDF. reply
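For readers unfamiliar with what "bundling HTML and CSS as an EPUB" involves, the following Python sketch builds a one-page EPUB 3 container with the standard library. It is a simplified illustration, not a validated publishing tool: the function name `build_epub`, the file names inside the archive, and the placeholder identifier and modification date are all assumptions for this example, and the output has not been checked with a validator such as EPUBCheck.

```python
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Placeholder identifier and date: a real book needs its own values.
PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2021-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="page" href="page.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="page"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="page.xhtml">{title}</a></li></ol>
    </nav>
  </body>
</html>
"""

PAGE_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>{title}</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
  </head>
  <body>
{body}
  </body>
</html>
"""


def build_epub(path: str, title: str, body_xhtml: str, css: str = "") -> None:
    """Write a single-page EPUB; body_xhtml must already be well-formed XHTML."""
    safe_title = escape(title)
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry must come first and must be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("content.opf", PACKAGE_OPF.format(title=safe_title))
        zf.writestr("nav.xhtml", NAV_XHTML.format(title=safe_title))
        zf.writestr("page.xhtml",
                    PAGE_XHTML.format(title=safe_title, body=body_xhtml))
        zf.writestr("style.css", css)
```

The resulting file is an ordinary zip archive, so the "it's just HTML and CSS" claim can be checked by unzipping it and opening `page.xhtml` in a browser.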

the_other 2 days ago [–]

If you don’t want churn, don’t churn. PDF is not a web format and you’re wasting effort trying to shoehorn print content and a print format for display on the web. Just use HTML and don’t update it, it’s probably easier. reply

massysett 2 days ago [–]

It's pretty amazing that the basic HTML that I learned 20 years ago still works - it even displays fine on devices like tablets and phones that did not even exist 20 years ago. I understand the author's sentiment but PDF is an overreaction. Just write static boring HTML. reply

account42 1 day ago [–]

"it even displays fine on devices like tablets and phones that did not even exist 20 years ago"

It would display perfectly if mobile browsers didn't have broken defaults (to work around broken websites) that you need to disable using a viewport meta tag. reply

cxr 2 days ago [–]

Indeed, there's a lot of irony packed into the first page: Featured is a quote from LWN indicting the "software industry" and its "brittle dependencies". What's ironic about this? It's squarely about the parts of the software industry that deal in things that are not meant to be painted in the browser. If you want a solution to the (perceived) churn, it's funnily enough right in the quote from Mark Pilgrim: "I've migrated to HTML 4". HTML is almost certainly not going to end up drifting in such a way that DJB's qhasm bibliography page[1] is ever going to break. HTML and the Web standards in general are, with extremely rare exceptions, cumulative. It's pretty frightening how many technical people don't understand this; the Web is intentionally engineered to serve as "the infrastructure for handling humanity's publishing needs indefinitely"[2]. More frightening is that the biggest threat to this are people like the author here who treat the Web as if it's like any other thing that the computing industry puts out—i.e., already perennially broken. This is dangerous because it anachronistically cedes power to folks who'd try to argue at some point in the future that the things about the Web that they'd like to break (and might be in a position to break e.g. due to browser monopoly) are justified and no big deal, really. The author goes on to call out the Web ("of rubbish") as "user-hostile". Shortly afterward, he or she writes that "PDF makes a stand against the churn". More accurately, PDF makes a stand against the user, by prioritizing authors' creative whims over the reader's needs. This happens again later in their remarks about PDFs being page-oriented: "you are fundamentally not in control of the reading experience." The "you" here is not you, the actual reader. The control they refer to is, once again, the author's. 
You get other poor arguments—that PDFs are "offlineable" "files" that can be distributed "decentralized", none of which are accurate criticisms against what HTML lacks—unless those Java documentation zipballs that seemingly every university student enrolled in a CS program in the early 2000s was made to download are a collective hallucination. And it gets worse from there. Cute stunt to grab attention and all, but the arguments are fundamentally bankrupt.

[1] http://cr.yp.to/qhasm/literature.html
[2] https://news.ycombinator.com/item?id=27368632 reply

the_other 1 day ago [–]

Thank you for this detailed response!! reply

nonameiguess 1 day ago [–]

It's not a browser format (though browsers can render it), but that isn't the same as not being a web format. The web is just the ability to retrieve files from other people's servers, which may themselves reference other files on yet other people's servers. As long as a file format supports hyperlinks, it's suitable for the web. If you don't care about being able to actually click the hyperlink to activate your desktop system's URI scheme handler, then even plain text works fine. reply

austincheney 2 days ago [–]

That’s a hard sell. The churn exists because people want it, not end users, but people who are paid to produce websites. Most churn comes in two flavors:

analytics and spyware
convenience code for insecure developers reply

silon42 2 days ago [–]

EPUB? reply

jacobmischka 2 days ago [–]

Which is just basic HTML and CSS itself. reply

guywhocodes 2 days ago [–]

Yeah but it's a decent subset. Most of the complaints of the author should be significantly better reply

jacobmischka 1 day ago [–]

It would be better if they just used that subset and just published it directly instead of needlessly repackaging it, but if that's what was meant then sure. Maybe we need a better name for simple, semantic HTML and basic CSS. reply

Finnucane 1 day ago [–]

The point of it is to be a self-contained package. You still need hardware to read it, but not a server. In theory at least, once you have it, it's yours. (Of course the commercial ebook vendors are trying to spoil that.) reply

goodpoint 1 day ago [–]

No, it still supports plenty of trackers/spyware and so on. reply

mojuba 1 day ago [–]

EPUB is an under-appreciated format that I think can serve as short to mid-term storage for human knowledge. It can reasonably re-flow itself when necessary, with no language run-time required, just full Unicode support at least at the level of the time the file was published. That's the Internet of knowledge I'd love to see: things organized in EPUBs, searchable and downloadable. reply

qwerty456127 1 day ago [–]

PDF is very far from an ideal format for today's world of different-sized screens. It is a horrible experience on mobile and even worse on eInk pocket books. I would rather advocate making everything available in ePub. Or even better, FB2: it is an easy to grok/implement (designed with manual authoring, simple scripted processing and low-end devices in mind) single-XML structure that decouples the content from the view even more. I often convert ePubs to FB2 (with Pandoc and Calibre) to make PocketBook render them in its native fonts (which are always better) rather than in the font specified in the ePub. I would also mention that the text within PDFs is often not machine-readable (you copy-paste it and get text without spaces, with additional spaces or complete garbage), but I believe this is easily avoidable if you bake PDFs the proper way. I could also suggest publishing everything in Markdown (with images embedded in a Base64 section at the bottom), but this doesn't seem practical because browsers, book-reading apps and eInk devices don't support nice rendering of it directly.

"“But how can I implement shiny whizz-bang features that will engage readers and drive conversions?!” You can’t. PDF is boring"

It's not. It supports JavaScript, embedded video and other kinds of active content. Sadly. reply

kerryoco 22 hours ago [–]

"PDFs are files. We must not lose sight of the fact that files are a basic freedom."

This seems like the core belief of the article. And it's at odds with the nature of the web. In the beginning, the web was a network of devices transmitting files with addressable locations on the device, creating a more or less 1:1 relationship between the devices and the web - the devices WERE the web. But this inevitably faded as information wants to be... fast and it became easier to whip small data packets around describing state, not files. I agree with the Unixy belief - files are freedom. But trying to model the entire web on those files is fighting gravity. They're not going anywhere. They just have to travel through the Web Soup sometimes now. All the technologies enabling a global network of file sharing are still there, the author is just bemoaning today's lingua franca. (json?) And perhaps there is a fear that we will lose sight of "device-based computing" / file ownership. It has political overtones too... individualism vs collectivism. The web is a very interesting place to hash through those ideas in code before we hash through them in legislation. reply

msoad 1 day ago [–]

Companies' S-1 documents are shared on Hacker News. The SEC publishes them in both PDF and HTML. Guess which one works better? It's not the fault of the HTML standard if people are using React plus 20 different libraries for simple static content. reply

8note 1 day ago [–]

"PDFs are self contained, and can't be broken by an API going down" Is directly broken by "PDFs are part of the web, and part of the content can be by reference to a webpage" If that webpage goes down, that link it broken. That decentralized bit still needs to conform to broken copyright laws too. You can't just download a pdf then rehost it on your own without a license to do so .... There's also a big difference between a city and the modern web. We own the infrastructure in a city, vs rich people own it on the web. Rather than a city, the web is more like a company town. I don't think that's any different for pdfs either. The distribution is still coming from a web server owned by a company -- the real response is self hosting of your stuff, and self hosting by your friends for their stuff. The file format doesn't make it self hosted reply

clearing 1 day ago [–]

I honestly can't believe all the praise for HTML and web on HN in the face of this awesome critique. I hugely appreciate the love for actual files.

• PDFs are decentralised. You may have obtained this PDF from a website, or maybe not! Self-contained static files are liberating! They stand alone and are not dependent on being hosted by any particular web server or under any particular domain. You can publish to one host or to a thousand hosts or to none, and it maintains its identity through content-addressing, not through the blessing of a distributor. This seems to have gotten lost in the offense everyone has taken over the choice to not use 'simple HTML', despite the document's clear reasoning that to do even that would embed the content deep in the 'urban web'. All of these simple-complex propositions about making some subset language or automating document flows are missing the point entirely. reply

danShumway 1 day ago [–]

"You can publish to one host or to a thousand hosts or to none, and it maintains its identity through content-addressing, not through the blessing of a distributor."

It kind of seems like you're describing IPFS, except with worse content addressing guarantees. The vast majority of your users will never check to see if a PDF's content actually matches its content address.

"All of these simple-complex propositions about making some subset language or automating document flows are missing the point entirely."

Are they? It's really not that hard to build a self-contained HTML file, and to re-emphasize, signed PDFs and signed HTML files are about the same level of accessibility to most users. Web browsers don't really handle either; if you want those guarantees you need to use a protocol/technology with better support right from the start.

Also to be clear, despite the author's argument that PDFs can be self-contained, no browser guarantees that, and there's no way for me to tell if the PDF is self-contained when I click on it in Firefox unless I download it and check it myself offline or in a viewer that guarantees it won't make network requests. Nothing online that I'm aware of forces authors to use PDF/A, so when I download a PDF, I don't know what I'm getting. It's not actually the magical, re-hostable world that the author claims.

I'm not sure that people are missing the author's point so much as they're saying the author is making claims about the portability of PDFs that aren't necessarily accurate. Yes, it would be good to have better self-contained guarantees about some web content, but I'm not sure PDFs actually supply any of those guarantees. reply
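The check that "users will never do" is small in code. As a hedged sketch, the Python below derives a content address from a file's bytes and verifies a downloaded copy against it. The `sha256:` prefix and the function names are conventions invented for this example; IPFS uses its own CID encoding, which this does not reproduce.

```python
import hashlib


def content_address(path: str, chunk_size: int = 65536) -> str:
    """Return an address derived only from the file's bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large PDFs do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def matches_address(path: str, expected: str) -> bool:
    """True if the file at path still has the content the address names."""
    return content_address(path) == expected
```

The same file fetched from any host yields the same address, and a single changed byte yields a different one, which is the identity property the quoted passage relies on. It only helps if the reader obtains the expected address from somewhere they trust.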

sammalloy 2 days ago [–]

One problem I noticed on mobile, is that if I click on a link in the PDF and visit another page, and then try to traverse back, it takes me to the first page in the PDF, rather than the page I linked from. reply

throwawayswede 2 days ago [–]

While I appreciate the sentiment, I don't think PDF is the way, at least in the way you're currently doing it. PDF may be supported by browsers, but they're not intended for it; it's a secondary feature. Same for search engines. Same for mobile. Most browsers have Print to PDF. If you want people to be able to download an immutable version of your content, then just have a simple static version of your page with valid print CSS, or better yet, leave everything default. If you want to fight churn with PDF, just have a simple HTML website with a link to download a versioned PDF of your issue. Your website can be as simple as https://motherfuckingwebsite.com/ or https://bettermotherfuckingwebsite.com. reply

grumblenum 1 day ago [–]

There are also other lightweight alternatives. The Gopher protocol has a small, but disturbed following : http://gopher.muffinlabs.com/gopher.floodgap.com (you can actually use netcat as your gopher client). Gemini is a more modern gopher-inspired protocol https://gemini.circumlunar.space/. Personally, I'd be pleased to see a text-first approach gain adoption. I don't think anyone looks at the thick-client model browsers have evolved into and sees an optimal solution. I think evangelistic energy should probably be directed at complaining to organizations that share content through JS-framework monstrosities. Getting rank-and-file web-devs excited about lean websites doesn't hurt, but clients and CTOs have real decision making power. reply
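The remark that netcat is enough for a Gopher client reflects how small the protocol is: the client sends a selector line and reads until the server closes the connection. Below is a minimal Python sketch of that exchange plus a parser for menu lines. The names `gopher_fetch`, `parse_menu`, and `MenuItem` are invented for this example, and the parser only covers the basic tab-separated menu format, with no handling of extensions.

```python
import socket
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str   # e.g. "0" text file, "1" menu, "i" informational line
    display: str
    selector: str
    host: str
    port: int


def gopher_fetch(host: str, selector: str = "", port: int = 70,
                 timeout: float = 10.0) -> bytes:
    """Send one selector and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_menu(raw: bytes) -> List[MenuItem]:
    """Parse a menu: type+display, selector, host, port separated by tabs."""
    items = []
    for line in raw.decode("utf-8", errors="replace").splitlines():
        if line == ".":  # a lone dot ends the listing
            break
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0]:
            continue
        try:
            port = int(fields[3])
        except ValueError:
            continue
        items.append(MenuItem(fields[0][0], fields[0][1:],
                              fields[1], fields[2], port))
    return items
```

Something like `parse_menu(gopher_fetch("gopher.floodgap.com"))` would list that server's root menu, assuming the host is reachable on port 70.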

justanotherguy0 2 days ago [–]

Not optimized for mobile so I didn't read much and bounced. reply

PretzelPirate 1 day ago [–]

I read it on my phone. I then clicked an external link at the end and then hit my browser back button. I had to wait for the PDF to re-load and was unhappy when I found myself back at the top of the document. I would get a much better experience with html. reply

vimy 2 days ago [–]

Reading PDFs on a phone isn’t an enjoyable experience. reply

deregulateMed 2 days ago [–]

For books, I prefer it to Libby and Google Books. There are tons of pdf viewers to choose from, so if you don't like an App, there are more available. I like that mine remembers the last opened doc and page. I can copy text from pdf too. Although this isn't a comparison of ebook to pdf, it's html to pdf. reply

janandonly 1 day ago [–]

PDF-ing everything on your website is one way to go about it... I personally use the service at printfriendly [1] and Arc90's Readability [2] to make un-crufted and readable PDF files of web content that is worth saving for the coming decades. Added bonus: because these very small files are saved on my system, I can press Command + Spacebar and easily search through my multiple decades of interesting files... [1] https://www.printfriendly.com [2] https://ejucovy.github.io/readability/ reply

croes 1 day ago [–]

Now fight https://www.nngroup.com/articles/pdf-unfit-for-human-consump... reply

fsiefken 2 days ago [–]

Very good, I go for project gemini https://gemini.circumlunar.space/docs/faq.gmi reply

leephillips 1 day ago [–]

There are good points here, but I think the author slightly undermines his message because the layout and typography of this particular PDF are so poor. Probably because it “was written in the world’s greatest web authoring tool: LibreOffice Writer”. In other words, one advantage of PDF is that free authoring tools such as the TeX family can create typographically beautiful results that are nearly impossible to achieve with HTML, but he leaves that on the table. reply

uncomputation 1 day ago [–]

I cannot tell if this is satirical or not. Assuming it is not, every single “pro” of PDFs is just plain incorrect except for the one about being “self-contained” to which I point to https://gwern.net as a good example of self-contained HTML. Gwern archives all the pages he references so that they are always available. In the case this is satire, I applaud it because I did get a few chuckles. reply

bittercynic 1 day ago [–]

In the words of the great Ivan Stang: "I'm joking AND I'm serious!" *I'm not the author, just thought the sentiment from that quote applied here. reply

croes 1 day ago [–]

Useless rant. His choice won't change the rest of the internet and for his site he could easily write lean html without all the stuff he complains about. reply

rerx 1 day ago [–]

When I click on the submitted link with Chrome on Android, it asks me if I want to redownload "0.pdf". Such a confusing question. If I pick the wrong answer, I end up with some restaurant menu I must have looked at months ago, not what the global poster intended. So for non-confusing real-world UX I'd recommend extra care with file names if you want to go PDF only. reply

Aeolun 2 days ago [–]

All of the stuff he says PDF is, is the same for HTML. reply

zeusk 2 days ago [–]

Well, sort of. Can't HTML contain script tags with external references (XMLHttpRequest or any async fetch) that a simple crawler/browser may not save to disk? reply

wccrawford 2 days ago [–]

They could, but if he's the one creating the file, he can choose. And if he's just hosting the file, I'm sure there are tools that will inline all the external resources. reply

wffurr 2 days ago [–]

It can, but you don’t have to. It’s absolutely possible to write self contained html files. reply
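Whether a given HTML file is self-contained can be checked mechanically, at least in part. The Python sketch below lists resources a page would fetch from another host when rendered. The class and function names are made up for this example, and the check is deliberately shallow: it looks only at a handful of attributes and cannot see requests made by inline scripts (XMLHttpRequest or fetch), which is exactly the case raised above.

```python
from html.parser import HTMLParser
from typing import List


class ExternalRefFinder(HTMLParser):
    """Collect absolute URLs in attributes that trigger a network fetch."""

    # <a href> is navigation, not a dependency, so it is not listed here.
    FETCHING = {
        ("img", "src"), ("script", "src"), ("iframe", "src"),
        ("link", "href"), ("video", "src"), ("audio", "src"),
        ("source", "src"),
    }

    def __init__(self) -> None:
        super().__init__()
        self.refs: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if (tag, name) in self.FETCHING and value and \
                    value.startswith(("http://", "https://", "//")):
                self.refs.append(value)


def external_refs(html: str) -> List[str]:
    """Return external resource URLs found in html, in document order."""
    finder = ExternalRefFinder()
    finder.feed(html)
    finder.close()
    return finder.refs
```

An empty result means none of the checked attributes point at another host; it does not prove the page makes no network requests.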

eaton 1 day ago [–]

The whole post boils down to: "HTML is bad because it has scope creep and people use it for bad things, but PDF is good because I made this particular document in a way I like for a use case I prefer." You do you, man! Some people run Archie servers, some people create a directory full of PDFs. reply

stayux 2 days ago [–]

Thanks. I am starting a self-hosted blog about design fundamentals, best practices, etc. Using only PDF is not a solution for me. Combining minimalistic website design with PDF/EPUB will suit me well. I like your approach as a statement against web "pollution". reply

mattnewton 1 day ago [–]

I can’t tell if this is satire or not, because reading it on my phone hurt my eyes after the first couple pages. Please use EPub if you are after an open format or freeze web pages into an offline-able format and don’t use PDF. reply

dvfjsdhgfv 2 days ago [–]

I don't agree with author's choices (yes, I'm disciplined enough not to add irrelevant elements to my content), but it's really sad that things got to the point where someone actually suggests PDF as an alternative to the web. reply

ColinWright 1 day ago [–]

For reference, the original title was:

We are drowning in churn and noise. I am fighting by switching this site to PDF

I find the "actual" title unhelpful, unenlightening, uninformative, and uninviting, which is why I originally chose text taken directly from the page, so people would know what it was about before taking the time to click and read. I know why the HN mods have changed it to "Deurbanising the Web", but I wish they'd keep more informative titles, especially when taken from the article in question. reply

failwhaleshark 1 day ago [dead] [–]

I didn't understand what it meant. I thought it was a euphemism for gentrification. None of their knee-jerk, dilettante, low-effort rant clearly identifies what they're really mad about, or if they're just mad to be mad. It feels like a waste of everyone's time.

shortformblog 2 days ago [–]

Even though Jakob Nielsen is very much still alive, he’s rolling in his grave. reply

westcort 1 day ago [–]

While I agree with the thesis, I believe it is possible to do things like this with vanilla HTML. For example, I created a search engine that is just a static HTML page: www.locserendipity.com reply

bambax 2 days ago [–]

He should offer PDF in addition to basic HTML, not as a replacement. reply

temporallobe 1 day ago [–]

Why not just extremely simple, plain HTML? No frameworks, not even CSS. In fact, you could make your life even simpler by using markdown files and having the browser convert that to HTML in real time with a single JS library (there are a few, I am not promoting any one in particular), so it doesn’t even require a “back end”! Plain HTML, while not having all the “portable” attributes of PDF, is still pretty darn robust and most browsers handle printing (or conversion to PDF) quite well. reply

dredmorbius 1 day ago [–]

Some of the listed benefits don't apply. Notably paginated (PDF) vs. scrolled navigation, but also features such as formulae displays and specific typesetting / layout elements, in-page bookmarks, highlighting, and notes. For shorter documents that's not much of a problem. For anything much over chapter length (about 20 pages or 10,000 words), navigation within a single HTML page becomes painful. Well below that level on smaller devices. reply

prox 1 day ago [–]

I think it is because PDF is a document first, and HTML is often hard to save/file. A PDF can also be created with design in mind, in a document creation app, which after decades of HTML is still hard to do, I think. reply

BenjiWiebe 1 day ago [–]

HTML isn't hard to save and file on a computer, and on phones it seems everything is hard to save and file. reply

prox 1 day ago [–]

You are right in a technical sense, but if I ask someone who is a low level user to save a webpage, most don’t know how to do that. It’s not front and center or even encouraged! This makes a big difference for adoption. reply

eloisius 2 days ago [–]

I'm not old enough to remember Gopher being "the internet" but I have browsed a few retro sites that still run it. I wouldn't mind seeing some slightly upgraded gopher-like protocol that allowed for embedding images and maybe form submissions (without any scripting). Most of what I want to do online is read, and I'd be more than happy for everything to come with a standardized look and feel rather than whatever scroll jacking weirdo design every website feels like having. reply

sbazerque 1 day ago [–]

I like the idea of keeping HTML's document-centric original design, but accessing the documents using p2p protocols (instead of the client-server model used on the web). I'm working on an open-source implementation of this idea at https://www.hyperhyperspace.org reply

saint-loup 1 day ago [–]

This experiment is interesting, but not so bold or novel when you consider the culture around making zines (small, DIY, often quirky magazines). The creativity there is amazing and medium-wise it's often "hybrid" (print-oriented but shared online). For instance there's this tool to help creating zines. https://alienmelon.itch.io/electric-zine-maker reply

pseingatl 1 day ago [–]

Most people think that pdf's have to be letter or A4 size, but you can make them at A7 or A8 for a phone screen, or for that matter, any size you want. PDF is size-agnostic. There's nothing to stop you from creating documents the size of a phone screen. So you could put the phone screen-sized pdf at m.mysite.com and this small screen illegible complaint is solved. reply
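The page-size point can be made concrete with a little arithmetic. The sketch below converts nominal ISO 216 sizes from millimetres into PDF's default units (1/72 inch), which is what a page's MediaBox is expressed in. The names `ISO_SIZES_MM`, `mm_to_points`, and `media_box` are illustrative, not part of any PDF library.

```python
# Convert ISO 216 paper sizes (millimetres) to default PDF user-space units.
# One default PDF unit is 1/72 inch, and 1 inch is 25.4 mm.
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72

# Nominal ISO 216 dimensions in mm (width, height), portrait orientation.
ISO_SIZES_MM = {
    "A4": (210, 297),
    "A7": (74, 105),
    "A8": (52, 74),
}


def mm_to_points(mm: float) -> float:
    """Convert a length in millimetres to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def media_box(name: str) -> tuple[float, float]:
    """Return (width, height) in points, rounded to two decimals."""
    width_mm, height_mm = ISO_SIZES_MM[name]
    return (round(mm_to_points(width_mm), 2), round(mm_to_points(height_mm), 2))


for size_name in ISO_SIZES_MM:
    print(size_name, media_box(size_name))
```

An A7 page comes out at roughly 210 x 298 points, about a third of A4's width and height, so text set at a normal point size on A7 would not need zooming on a phone-sized display.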

dredmorbius 1 day ago [–]

The site would be inspired to automatically detect device sizes (JS or CSS media queries) and offer an appropriately-scaled PDF download option. Unfortunately it didn't opt for that. reply

apotheon 1 day ago [–]

Why does it seem like almost everyone doesn't realize that PDFs can easily be made to support all the horrors we see in HTML? No, it's fucking well not impossible -- or even notably difficult -- to jam some malicious dynamic code into a PDF. The only reason a period of widespread fear about PDF viruses hasn't developed as it has for websites spreading malicious code is the fact that websites got much more widely adopted. PDFs have been used as malicious code vectors before, and replacing HTML with PDFs would only result in PDFs being the new common vector for the same problem, with at least the same scale and intensity. This only seems like a solution if you don't know what PDFs can do -- and, by the way, sometimes pagination is bad, especially static (non-reflow) pagination. EDIT: Let's make this clearer. You can actually embed an entire JavaScript application in a PDF. Tell me again how PDFs somehow prevent the problem of dynamic pages on the web. All using PDFs instead of HTML pages would do is wrap the horrors of the web in forms that are generally more hostile to various viewing contexts for the less harmful use cases (e.g. static pages suddenly being harder to read in some contexts with PDFs than with HTML pages). reply

X6S1x6Okd1st 1 day ago [–]

"I'm mad as hell and I'm not gonna take it any more" but for webtech. It's totally unclear why they don't just use a subset reply

LightG 2 days ago [–]

Appreciated the sentiment of it. It's not ideal, but in a non-ideal world where the big boys have ruined the web, I tip my hat to this effort with a large dose of empathy. Cheers, reply

ccorcos 1 day ago [–]

"Files are a basic human freedom" - that definitely resonates with me. There's an assortment of trade-offs though. In particular, linking between files breaks if you ever want to move or rename a file. Also, by self-encapsulating every file, you end up using space less efficiently. reply

SethMurphy 2 days ago [–]

Naming or framing things in a difficult or obtuse way can be a good way to limit your audience. However, if it works others will follow and it will no longer be effective. I had a similar experience with a Meetup I once hosted which I specifically put in a location that was difficult (but admittedly becoming trendy). It worked for a bit but eventually attracted the crowd I was trying to alienate. reply

cerved 2 days ago [–]

this is a joke right? reply

jacamera 2 days ago [–]

Yes. Though I think the real question is whether or not it was intentional. reply

xvector 2 days ago [–]

I think simple HTML + print to PDF (supported by default in most browsers) is a much more elegant solution. reply

opsecweather 2 days ago [–]

Run it through outline.com first to remove all the ad-sidebars. reply

divbzero 1 day ago [–]

I like the spirit of this but would prefer text or static HTML over PDF as choice of file format. reply

9876545678 1 day ago [–]

Comments here are disappointing. The problem with any of this is getting any momentum, so given the level of pushback, pdf might not be it. Having to be a specific version of pdf probably doesn't help. Creating new spec is hopeless as well unless you are someone very famous and can manage to get enough people to adopt. There's text/markdown mediatype which can also serve this purpose but it needs a boost from someone with some street cred. People work in predictable ways and this is a political project. https://datatracker.ietf.org/doc/html/rfc7763 reply

zabzonk 2 days ago [–]

Sorry, I'm not a Web developer - what is meant by "churn and noise" in this context? reply

kissgyorgy 2 days ago [–]

It's a terrible "implementation", but interesting observations we should consider. reply

marbu 1 day ago [–]

I don't consider using pdf for this purpose a good idea. It would be better to have static html pages, with a reference to an epub with the same content. One can have both generated from the same source with a static site generator. reply

pharke 1 day ago [–]

Isn't this what IPFS is for? reply

knownjorbist 1 day ago [–]

I'm surprised that IPFS and others aren't mentioned more here. The solution is staring us in the face, it's related to cryptography. reply
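A rough illustration of the cryptographic idea behind content addressing: a document is named by a hash of its bytes, so the name stays valid wherever the file is stored, and any change to the content yields a different name. This is a simplified sketch only. IPFS itself uses CIDs built on multihash and splits files into chunks, and the `store`, `put`, and `get` names here are hypothetical.

```python
import hashlib

# A toy in-memory content-addressed store: address -> bytes.
store: dict[str, bytes] = {}


def content_address(data: bytes) -> str:
    """Derive an address from the content itself (SHA-256, hex encoded)."""
    return hashlib.sha256(data).hexdigest()


def put(data: bytes) -> str:
    """Store data under its own hash and return that address."""
    address = content_address(data)
    store[address] = data
    return address


def get(address: str) -> bytes:
    """Fetch data and verify that it still matches its address."""
    data = store[address]
    if content_address(data) != address:
        raise ValueError("content does not match its address")
    return data
```

Because the address is derived from the bytes, a reader can verify a downloaded copy against the link that pointed to it, no matter which host served it.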

BaldricksGhost 1 day ago [–]

How about plain old HTML? Might not be as pretty but it sure beats a bloated PDF. reply

npteljes 1 day ago [–]

It also wouldn't be upvoted on HN. I agree that a static page generator would have been a much more fitting technology (for example). But sometimes you gotta sacrifice that for visibility. reply

afavour 1 day ago [–]

Didn’t expect I’d see a top post on HN defending the page-centric nature of PDFs. A paged format is awful for anything other than printing out pages. But hey, it’s a big wide web, you do you. But I won’t be reading. reply

maccard 1 day ago [–]

I've always wondered why some sites can serve PDFs that my browser (firefox) can view inline (my preferred method), rather than forcing me to download the file and open in a separate application reply

chrismorgan 1 day ago [–]

It depends on the Content-Disposition header: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Co.... There are extensions that let you intercept this header, e.g. https://addons.mozilla.org/en-GB/firefox/addon/no-pdf-downlo... which per https://github.com/MorbZ/no-pdf-download/blob/c924d657f33398... detects the content-type and if it’s PDFy replaces the content-disposition header with “inline”. (Clicking on a link that has the download attribute set also affects things: https://developer.mozilla.org/en-US/docs/Web/API/HTMLAnchorE....) reply
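To make the header's effect concrete, here is a minimal sketch of a server that chooses between the two dispositions, using only Python's standard library. The helper `pdf_headers`, the handler `PdfHandler`, the `?download` query convention, and the file name `example.pdf` are all assumptions made for illustration, and the response body is a placeholder rather than a valid PDF.

```python
from http.server import BaseHTTPRequestHandler


def pdf_headers(inline: bool, filename: str) -> dict[str, str]:
    """Build response headers asking the browser to display or download a PDF."""
    disposition = "inline" if inline else "attachment"
    return {
        "Content-Type": "application/pdf",
        "Content-Disposition": f'{disposition}; filename="{filename}"',
    }


class PdfHandler(BaseHTTPRequestHandler):
    # Placeholder bytes only; a real server would read an actual PDF file.
    PDF_BYTES = b"%PDF-1.4\n%%EOF\n"

    def do_GET(self):
        # Hypothetical convention: "?download" requests a forced download.
        inline = not self.path.endswith("?download")
        self.send_response(200)
        for name, value in pdf_headers(inline, "example.pdf").items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(self.PDF_BYTES)))
        self.end_headers()
        self.wfile.write(self.PDF_BYTES)

# To try it locally (not started here):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), PdfHandler).serve_forever()
```

With `inline`, a browser that has a built-in PDF viewer will usually render the file in the tab; with `attachment`, it is asked to save the file instead.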

maccard 1 day ago [–]

Today I am one of the lucky 10,000 [0]. I learned about a new header, and fixed an issue I've had for years.

[0] https://xkcd.com/1053/ reply

agomez314 2 days ago [–]

The author has a point in that many people want an online presence but the way they imagine it is more akin to a pamphlet or poster than a hyperlinked website. If that is the case, then pdf or a resizable image makes sense. reply

leephillips 1 day ago [–]

A related idea is making a website entirely from SVG. Here is a lovely example: https://ozake.com/ reply

chrismorgan 1 day ago [–]

One previous discussion in comments: https://news.ycombinator.com/item?id=24257982 For my part, I expressed bafflement because the end result seems worse than the starting point in almost every way, including those that the author was complaining about the web for. (There are a couple of others to be found in https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que..., but not so substantial.) reply

snksnk 2 days ago [–]

Why not use TeX/LaTeX instead and also include a link to the code? reply

jimhefferon 2 days ago [–]

The LaTeX below will leave a push-pin symbol in the text, and clicking on it shows the code.

  \documentclass{article}
  \usepackage{attachfile}
  \usepackage{lipsum}

  \begin{document}
  \expandafter\attachfile\expandafter{\jobname.tex}
  \lipsum[1-150]
  \end{document}

reply

leephillips 1 day ago [–]

Using xelatex, I got only the text, no pushbutton. Using pdflatex, I got a pushbutton, but it was not a hyperlink, just an image. What engine do you use to get this to work? reply

jimhefferon 1 day ago [–]

I ran pdflatex from a 2017 TeX Live install under Ubuntu, and viewed in Acrobat Reader. reply

leephillips 1 day ago [–]

Ahh, I think it is a viewer problem. Sadly, most viewers can not handle PDF attachments properly or at all. reply

gtirloni 1 day ago [–]

I can't read this in my phone. There's no automatic layout and the fonts are too small. Zooming in works but it's a nightmare to navigate. This is an accessibility disaster. reply

0xcoffee 2 days ago [–]

Excellent! Excited to see the next PDF generator framework. reply

rado 2 days ago [–]

This is terrible for accessibility. Please just use semantic HTML and your web will be usable on 10yo devices and unknown devices 10 years in the future. reply

csomar 2 days ago [–]

It's ironical that the author is pitching for PDF, and yet he is using a plethora of hyper-links. The big "invention" of the Web was linking pages together. That's what made it great. That's what created "Google" in the first place. Links in a PDF are supposed to take you to a browser or open a different PDF file? PDF is a step back. If you are angry about the overblown size of JavaScript and resources consumption, use a simple static website. It doesn't get easier than that. reply

alisonkisk 1 day ago [–]

You're conflating browsers with markup language. Clicking an HTML link opens a different HTML file. reply

saltdoo 1 day ago [–]

I'm on mobile and unable to open any links in this pdf after opening with three different pdf viewing apps. :/ reply

GrumpyNl 2 days ago [–]

Instead of pdf, why not the most basic HTML? reply

specialist 1 day ago [–]

OC page three:

PDFs used to be unreadable on small screens, but now you can reflow them. Off hand, which PDF viewers do reflow? reply

dredmorbius 1 day ago [–]

Foxit, PocketBook Reader, FBReader. Presumably Adobe Acrobat though I've not touched that in a decade or more. There are also utilities such as the poppler library's pdftotext which will dump ASCII / bare text from at least some PDFs. reply

SynapsePixels 1 day ago [–]

PDFs used to be inaccessible, but now you can tag them. This PDF is not accessibility compliant. reply

richardwhiuk 1 day ago [–]

You can server side render PDFs and make them dynamic if you wanted to. reply

halayli 2 days ago [–]

pdf is a major attack surface too. reply

danShumway 1 day ago [–]

I guess by modern standards this load time is acceptable, but when you argue that PDFs are a way to move forward, you're competing with HTML 4/5. And by that standard:

Crud this website is so slow. Unacceptably slow. If your technology stack is spending 10 seconds just to fetch and render 13 pages of large-screen text, then either you're doing something wrong or it's a bad technology stack. That load time alone should kill this idea.
There's no way for me to turn off images. This is the opposite of a client-respecting webpage, the only way you could make it worse is by rendering to Canvas or shipping me a PNG. My mobile browser doesn't fetch fonts by default. You're overriding my choice to do that.
Mobile? Reflow? Responsive design? Adjustable font sizes? The author kind of offhandedly says that PDFs can do reflow right now, but how many clients actually support that? Does the PDF format handle this by default?
Saying "you can technically make PDF accessible" is exactly the same as saying "you can technically use just a subset of HTML." It's the same argument. Nobody does it, PDFs are generally hostile to accessibility, and there's no way to signal that a PDF is accessible or enforce it as a community standard.

So, the much bigger question: what's wrong with Gemini[0]? I've been critical of Gemini in the past on multiple fronts, but if you are in this space where you want to burn everything down and make your blog static, Gemini really does seem to solve every problem that the author has, except better. It's also trivial to proxy Gemini documents or statically re-render them to HTML, which makes them accessible to people outside the community. And by default, they're both pretty accessible to screen readers, and much more efficient than what the author is proposing.

The author argues that using static HTML wouldn't be good enough because there's no standard that forces you to exclude Javascript. Then they point to PDF/A, which is not a standard that is enforced by most browser PDF viewers. To me, this argument isn't any different from telling website authors to choose not to use Javascript: what is going to force anyone to use PDF/A? Every web browser PDF reader supports Javascript. NoScript support in Firefox is better than the controls/extensions for disabling PDF scripting.

And Gemini is right there: for the most part it's actually working today. So I just don't get it. Why pick a technology that's tangibly worse than the web on (and I mean this quite literally) almost every single axis and every single metric, when you could instead switch to a markup language that actually does have use-cases, that does simplify deployment and blogging in some instances, that does have a real community, that does have some real advantages over HTML, that does have some real momentum behind it, and that doesn't disrespect my choices about what fonts/images I want to download?

dahfizz 1 day ago [–]

If this catches on, there will be "JS in PDF" in no time. reply

MawKKe 1 day ago [–]

as in "it exists already"? reply

admax88q 1 day ago [–]

Well this sucked to read on mobile. I'll stick to HTML. reply

ok123456 1 day ago [–]

Jekyll plugin that produces a pdf version of each page? reply

JorgeGT 2 days ago [–]

While this may be extreme, I do notice that it is becoming harder and harder to print webpages to PDF/paper. Is there a good approach for this besides the standard print dialog? reply

bigyikes 2 days ago [–]

For sites without print-specific media queries (so basically all websites) I use dev tools to delete all the DOM nodes I don’t want to appear in print. reply

kuu 2 days ago [–]

Maybe use the read mode of Firefox and then print it? reply

prox 2 days ago [–]

I love the basic idea here. Needs polishing if you want to blow this up to the masses. It’s like my Pi who just does one thing really well, and allows me to tinker on every level if I so choose. reply

prox 2 days ago [–]

I'd like to add that I think a well designed PDF is just so much better looking than any html based page (and has a lot more freedom) reply

pasc1878 1 day ago [–]

Definitely less freedom. On html, I, the reader, can change the size of text or even the font and the text will reflow so you don't need to scroll horizontally to read each line. How do you do that to a pdf? reply

prox 1 day ago [–]

That’s not what I mean (your point has merit). If I ask a designer to design a website, he has to send it off for implementation, or is confined by html breakpoint/accessibility options. PDF can go straight from designer to document and do everything in a program like designer, indesign and so on. It’s a designer first paradigm. reply

PaulHoule 1 day ago [–]

PDF has quite the attack surface. It supports Javascript, 3D models, JBIG2 compression that turns 8's into 6's and all sorts of strange things. reply

SMAAART 2 days ago [–]

Well, that's innovative. But why not HTML 2? reply

tonis2 1 day ago [–]

What a nice website, what framework is it built with ? Maybe Vue.js or Angular.js or maybe Nuxt fuking js ? reply

api 1 day ago [–]

The point about the size of the W3C spec is hilarious, but I wonder how much of that hundred million plus words is actually necessary to implement the parts of the spec that people use? Surely it would be possible to create a spec that captured the most useful subset of HTML and CSS functionality. In any case if the spec really is that huge the W3C should be written off. Any organization that produces a spec like that is worthless. reply

atemerev 1 day ago [–]

"But stable standards are incredibly important.They allow software, at least in theory, to be finished. Why is it importantthat software be finished? Because it gives us hope that we might end thechurn and fix all the bugs! I want to use software whose version number is7 1.0. I want to use software whose every line of code has been studied,analysed, optimised and punishingly tested. I want every component andsubcomponent and every interaction and every configuration to beexquisitely documented, and taught in courses, and painstakinglydeconstructed and proven sound" Sorry, not possible. Never, ever. Software does not work like that. Bugs will never be fixed (if they could, the software in question would have become obsolete long ago). By the way, this is what you get when you try to copypaste text from this "website". reply

blacktriangle 1 day ago [–]

"HTML’s semantic capabilities were oversold." THANK YOU! HTML semantics are a trap, just enough to make you think something is there but anemic enough to be a giant excersize in bikesheading. Ask yourself this: If HTML semantics were adequate, why do we have ARIA and 90 different microformats? Other than that, I read the article expecting to be annoyed by the PDF presentation but was pleasantly surprised by how it read just like I would want a content page to read. My only complaint is that browsers (at least Brave) do not preserve scroll position in PDFs. If the browsers fix that the author may be onto something here. reply

lucian1900 1 day ago [–]

Sounds to me like ePub would fit better. It’s designed for reflow and it’s built out of a subset of HTML. Worst case the contents of the file can be expanded. reply

solarpunk 1 day ago [–]

it appears techies have discovered zines? reply

dredmorbius 1 day ago [–]

Long sympathetic with the Jakob Nielsen / PDF bad camp ... I've had some recent changes of heart. Not a full convert, but PDF is often superior to HTML, especially for longer-form and complex noninteractive content.

Books are an artefact whose design has evolved over the centuries to accommodate human-scale ergonomics: font size, paper and ink colour, words per line, lines per page, pages per volume, overall weight and dimensions. Standard-sized books are all larger than the largest current mobile phones, with diagonal measures of about 9--12 inches. There are smaller and larger books, but those are compromises either to portability (pocketbooks) or to large-format resolution and detail ("coffee table" books, atlases, and the like). Magazines tend to run even larger (about 13"), broadsheet newspapers larger yet.

Most criticisms of PDFs are actually criticisms of the devices and displays used to read them. Poor resolution, incorrect aspect ratios, and small display sizes (especially mobile devices) are the key problems.

Reading PDFs on a tablet, especially a larger e-ink device, is a game-changer. I now actively avoid HTML, or at least launch it in a browser designed with e-ink in mind (EInkBro: https://github.com/plateaukao/browser). Otherwise, my large (13.3") high-DPI (200+) B&W ebook reader is an excellent long-form immersive reading tool.

The key requirement of a mobile phone is that it fit in a pocket, handbag, or purse. They are too small for reading, and aren't designed for that purpose. Current devices feature screen sizes of roughly 5--7 inches (diagonal measurement). At the lower end, that's smaller than a 3x5 index card (6"), and the largest barely the size of a 4x6 card (7").

On desktops, the first display that offered what I felt was a truly comfortable two-pages-up PDF reading experience was the 27" Retina iMac. Its 5K display (itself an oddball size) suits document work well. Even not fully maximised, most books are highly readable (leaving screen space for other tasks), and at full maximisation, details really stand out especially from scans of historical editions. (Such details aren't always relevant, but often are.)

PDF also provides capabilities HTML either cannot or does not by default (and few seem to be persuaded to offer), especially pagination, formulae, and a spatially-persistent layout (if you have a spatial memory, this is very valuable). PDFs can, though often do not, include internal navigation (chapters, sections, etc.), search (if full text is included), and most critically, metadata (at a minimum, author, title, date, and publisher, see the full Dublin Core metadata specification for what should be required). PDFs can also be published directly to device sizes (or to a set of form factors encompassing typical devices), as several others note.

Some of the issues aren't entirely intrinsic, and my feeling is that wider use of PDFs for online content would lead to a proliferation of PDF annoyances to match present-day Web annoyances. In each case, the fundamental problem is that publishers rather than readers have final say over presentation. An alternative, of distributing raw minimum markup and formatting that to user specifications following a set number of templates ... might help.

It's ironic that the article here embodies a number of PDF annoyances:

The shaded background renders quite poorly on a B&W e-ink reader (though can be eliminated with a watermark-removal setting).
The filename provides no clues as to contents or provenance, and is likely to collide with other content.
I'm a fan of serif fonts, not sans serif, for high-DPI reading.
Internal and external hyperlink support is ... variable. At times utterly missing, at others, inconsistent or inconvenient.
PDFs are not trivially directly editable, which means neither authors nor readers can easily correct errors or address issues.
Many PDFs lack internal structure, even where the documents they encompass do. The number of books lacking PDF table-of-contents support is ... large.
Metadata standards and practices are abysmal. See the Dublin Core standards.
Naming conventions similarly. "Report", "Resume", "Project", or "0.pdf" are names which should never be used. Describe author, content, and date, as a minimum, if possible. reply
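One way to read the naming advice above is as a small convention: author, title, and date joined into a single file name. The sketch below is only one possible scheme. The function name `document_filename`, the underscore separator, and the example author and date are assumptions made for illustration, not details taken from the article.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and collapse runs of non-alphanumerics into hyphens.

    Note: this simple rule drops non-ASCII letters as well.
    """
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def document_filename(author: str, title: str, published: date, ext: str = "pdf") -> str:
    """Build a descriptive name of the form author_title_date.ext."""
    return f"{slug(author)}_{slug(title)}_{published.isoformat()}.{ext}"


# Hypothetical author and date, used purely as an example.
print(document_filename("Example Author", "Deurbanising the Web", date(2021, 1, 1)))
```

A name built this way sorts sensibly in a directory listing and is far less likely to collide with other downloads than "0.pdf" or "Report.pdf".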

DocTomoe 1 day ago [–]

This sounds like the Creative Director I worked with, ca. 1998, who bemoaned that he couldn't have pixel-perfect layouts over a wide variety of devices/browsers/operating systems. reply

6510 1 day ago [–]

the url should end like /Deurbanising-the-Web.pdf so that hitting the save button doesn't name the file document.pdf or 0.pdf Also this... https://lab6.com/0#%5B%7B%22num%22%3A1%2C%22gen%22%3A0%7D%2C... eh? reply
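The file-name complaint follows from how a default save name is commonly derived: when the server sends no Content-Disposition filename, browsers typically fall back to the last segment of the URL path. The sketch below mimics that fallback. Real browsers differ in detail (some append an extension based on the content type), and the second URL is hypothetical, built from the path suggested in the comment above.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def default_save_name(url: str) -> str:
    """Return the last path segment of a URL, ignoring query and fragment."""
    path = urlsplit(url).path
    return unquote(posixpath.basename(path))


# The article's URL has "0" as its last path segment.
print(default_save_name("https://lab6.com/0"))
# Hypothetical URL following the suggestion above.
print(default_save_name("https://lab6.com/Deurbanising-the-Web.pdf"))
```

The first call yields "0" and the second "Deurbanising-the-Web.pdf", which is why a descriptive final path segment leads to a descriptive saved file.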

failwhaleshark 1 day ago [dead] [–]

Cut off your nose to spite your audience. PDF is meant for viewing and printing books. It's not very good for browsing and requires PDF viewers. All of the browser add-ons, functionality, and behaviors are lost by forcing people to use a PDF viewer. HTML is meant primarily for browsing but it can also be used for print media. CSS can specify paper sizes. If someone were so worried about external media, they can host it themselves or roll their own CDN. If they were so worried about fonts, they can include them themselves. It's more semantic web-compatible to describe a website with RDF and have PDF, EPUB, DJVU, MOBI, TXT, PS, etc. links there and also in the webpage. This is how you provide the most accessibility. Furthermore, using a meta document language like LaTeX or something XML that can transform into other document artifact forms mechanically is the way to go.

Ostrogodsky 1 day ago [flagged] [dead] [–]

"And for that reason I am creating a 1 MB behemoth that you need to download to read 3000 words or so."

KEITH_PETERSON 2 days ago [flagged] [dead] [–]

I just opened your website on mobile and it's very user friendly, I got to scroll in many directions to read the content. We build our own website with gatsby and only use js if it's really needed (when you click interactive links, we're still trying to improve a bit. We customized Gatsby because doesn't support this out of the box) that gets 100 score on mobile on Google page speed: https://marxcommunications.com/ Proof: https://imgur.com/a/N4IJoEk Or run it yourself: https://developers.google.com/speed/pagespeed/insights/?url=... It's possible but takes some work.

midrus 2 days ago [–]

LOL reply

everyone 1 day ago [–]

"with no external dependencies to manage." Except for like, the software which reads a
