Show HN: Make your site’s pages instant in one minute

WestCoastJustin · on Feb 9, 2019

Live demo on my website @ https://sysadmincasts.com/

Temporary added it in-line for testing. I was already in the sub 100ms level but this just puts it over the top! Also updated all admin add/edit/delete/toggle/logout links with "data-no-instant". Pretty easy. Open developer preview and watch the network tab. Pretty neat to watch it prefetch! Thanks for creating this!

ps. Working on adding the license comment. I strip comments at the template parse level (working on that now).

pps. I was using https://developers.google.com/speed/pagespeed/insights/ to debug page speed before. Then working down the list of its suggestions. Scoring 98/100 on mobile and 100/100 on desktop. I ended up inlining some css, most images are converted to base64 and inlined (no extra http calls), heavy cache of db results on the backend, wrote the CMS in Go, using a CDN (with content cache), all to get to sub 100ms page loads. Pretty hilarious when you think about it but it works pretty well.

fwip · on Feb 10, 2019

If I mouse-over the same link 10 times, it looks in my network tab like it downloads the link 10 times.

I'd expect this preload script to remember the pages it's already fetched and not duplicate work unnecessarily. :/

aruggirello · on Feb 10, 2019

Perhaps the author could add a script parameter, or support an optional 'preload-cache-for' attribute, so you'd write <a preload-cache-for="300s" ...>

If you really care about speed anyway, you should already have setup your site to max out caching opportunities (Etag, Last-modified, and replying "not-modified" to "if-modified-since" queries) - I'd suggest the author should ensure the script does support caching to the broadest extent possible - hitting your site whenever appropriate.

dharmab · on Feb 10, 2019

Cache-Control headers already do a better job of solving that problem https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Ca...

ajsharp · on Feb 10, 2019

Set your http cache headers correctly to instruct the client to save and re-use pages it downloads.

WestCoastJustin · on Feb 10, 2019

Yeah, this is likely something I need to look into. Since the site changes quite a bit for, non-logged vs logged, I'm not really doing much html caching right now. I'll check that out though. Even if I added something like 30 second cache TTL that might work. I'd have to see if that is even possible and test though. Maybe force an update on login? I'm not a html cache pro so I'd need to see what the options are and then do some testing. For now, it was just fast enough / not worth the time, to do anything other than a fresh page for each request. But, this is a good suggestion. Thanks.

ps. I do tons of caching for any images and videos though. I know those things never change so have weeks of caching enabled.

threatofrain · on Feb 10, 2019

But how do you know it's still fresh?

Piskvorrr · on Feb 10, 2019

HTTP has multiple cache control headers. It's a fairly complex topic, but TL;DR: do the config and it works in any browser since ~2000 (yes, even IE6).

harrisonjackson · on Feb 9, 2019

Very impressive! Wonder if it'd be worth rerouting some of the paginated URLs like the episodes `?page=4` to `/page-4` or something like that. :shrug: - either way, looks like you had some fun optimizing it!

merkaloid · on Feb 10, 2019

/page/4 would probably be more prudent

harrisonjackson · on Feb 16, 2019

I don't like this as much since a savvy user might then try to navigate to /pages/ - which has what on it?

WestCoastJustin · on Feb 9, 2019

Good idea. I'll check that out! Thanks.

dieulot · on Feb 10, 2019

I just released version 1.1.0 which allows whitelisting specific links with query strings like those. https://instant.page/blacklist

WestCoastJustin · on Feb 10, 2019

Awesome, thank you!

m0zg · on Feb 9, 2019

This is what the OP should have posted. Very impressive. Will use in my work.

ehsankia · on Feb 10, 2019

I like that there's an extra optimization where it cancels the preload if your cursor leaves and the prefetch hasn't finished it. Would help on really slow networks and pages that timeout.

totony · on Feb 9, 2019

nothing on my network tab in ff 65.0

SCdF · on Feb 9, 2019

ditto. It works on Chrome though (sigh)

sneak · on Feb 10, 2019

Why the sigh? It says right on the site that this gracefully degrades on browsers that don’t support it. Why is it a problem making a site faster in a browser designed for speed, if it does not degrade the experience at all in all other browsers?

amenod · on Feb 10, 2019

That is OK, if you are talking about older browsers or browsers on limited / obscure platforms. Firefox doesn't fit the bill, and optimizing the site just for ~~IE~~ Chrome hurts the FF users and makes Chrome win even more - leading to self-fulfilling claim that Chrome is faster. It's not, but it will be if people optimize for it. This is one of the reasons I make a point to always develop in Firefox and later just check in Chrome (note that I still do check, because majority of people use it) - apart from simply better experience of course. :)

calibas · on Feb 10, 2019

Not working for me either, Chrome works great though.

SquareWheel · on Feb 10, 2019

Embedding base64 images isn't really more efficient on HTTP/2 servers. Base64 adds overhead, and multiplexing mitigates the cost of additional network requests.

ponyfleisch · on Feb 10, 2019

But unless you use server push, it still is an additional roundtrip.

dalore · on Feb 13, 2019

Yes but since it's multiplexed into the same tcp stream it doesn't suffer from slow start and so the tcp window is already large so it's not as bad as it would be on http1.

WestCoastJustin · on Feb 11, 2019

UPDATE - Feb 11. I removed the include for now. I was seeing a weird caching issue when people login/logout where the navbar would not update correctly (for the first few requests after). I'm still digging into this. I likely need to invalidate the browser cache some how. Doing some research to see what the options are.

bradknowles · on Feb 11, 2019

Have you signed up for a page speed monitoring service, like https://speedmonitor.io/ ?

I'd be very curious to know what your performance looks like over time, especially as it relates to various improvements that you try out.

bradknowles · on Feb 12, 2019

Or maybe Pingdom, NewRelic, or gtmetrix?

ksec · on Feb 10, 2019

I wonder if breaking the page into two parts as above and below the fold, Above would even have base64 images and all inlined, below would get analytics script loaded.

xpose2000 · on Feb 9, 2019

I've been testing it for the past 30 minutes or so and found that it doesn't cause the same problems that InstantClick did. (Which was javascript errors that would randomly occur.) I'll limit it to a small subset of users to see if any errors are reported but there is a good chance this could go live for all logged in users. Maybe even all website visitors if all goes well.

Seems to have no impact on any javascript, including ads. Pages do load faster, and I can see the prefetch working.

Just make sure you apply the data-no-instant tag to your logout link, otherwise it'll logout on mouseover.

hn_throwaway_99 · on Feb 9, 2019

> Just make sure you apply the data-no-instant tag to your logout link, otherwise it'll logout on mouseover.

Logout links should never be GETs in the first place - they change states and should be POSTs.

slim · on Feb 9, 2019

POSTs are not Links. And Logout service is indempotent even if you can consider it changes the state of the system

nardi · on Feb 10, 2019

Lots of people in this thread confusing “idempotent” with “safe” as specified in the HTTP RFC: https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

kbirkeland · on Feb 10, 2019

FWIW RFC 2616 was obsoleted by the newer HTTP/1.1 RFCs: https://tools.ietf.org/html/rfc7231#section-4.2

amenod · on Feb 10, 2019

Which still doesn't change GP's point though:

> In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".

(there's an exception listed too, but doesn't apply to logout)

EDIT: I know of someone who made a backup of their wiki by simply doing a crawl - only to find out later that "delete this page" was implemented as links, and that the confirmation dialog only triggered if you had JS enabled. It was fun restoring the system.

kbirkeland · on Feb 11, 2019

I don't know why you think I'm contradicting them. I was just pointing out that there are newer RFCs. They also happen to have a stronger and more complete definition of safe methods.

Gaelan · on Feb 9, 2019

Ok, so make it a form/button styled to look like a link.

glacials · on Feb 9, 2019

Idempotency is not the issue, the issue is that a user might hover over the logout link, not click it, then move on to the rest of the site and find they are logged out for no reason.

leesalminen · on Feb 9, 2019

Right, which is why the included library includes an HTML attribute to disable prefetch on a given link.

pbreit · on Feb 9, 2019

OP’s point was that logout should not be implemented with a link/GET but instead with a button/POST for exactly this reason.

leesalminen · on Feb 9, 2019

A logout action is idempotent, though. You can't get logged out twice. In my opinion, that's the use case for a GET request.

I just checked NewRelic, Twilio, Stripe and GitHub. The first 3 logged out with a GET request and GitHub used a POST.

zepolen · on Feb 10, 2019

Idempotency has nothing to do with it. Deleting a resource is idempotent as well. You wouldn't do that via GET /delete

A GET request should never, ever change state. No buts.

Just because a bunch of well known sites use GET /logout to logout does not make it correct.

Doing anything else as demonstrated in this and other cases breaks web protocols, the right thing to do is:

GET /logout returns a page with a form button to logout POST /logout logs you out

derefr · on Feb 10, 2019

Depends on your definition of “state.” A GET to a dynamic resource can build that resource (by e.g. scraping some website or something—you can think of this as effectively what a reverse-proxy like Varnish is doing), and then cache that built resource. That cache is “state” that you’re mutating. You might also mutate, say, request metrics tables, or server logs. So it’s fine for a GET to cause things to happen—to change internal state.

The requirement on GETs is that it must result in no changes to the observed representational state transferred to any user: for any pair of GET requests a user might make, there must be no change to the representation transferred by one GET as a side-effect of submitting the other GET first.

If you are building dynamic pages, for example, then you must maintain the illusion that the resource representation “always was” what the GET that built the resource retrieved. A GET to a resource shouldn’t leak, in the transferred representation, any of the internal state mutated by the GET (e.g. access metrics.)

So, by this measure, the old-school “hit counter” images that incremented on every GET were incorrect: the GET causes a side-effect observable upon another GET (of the same resource), such that the ordering of your GETs matters.

But it wouldn’t be wrong to have a hit-counter-image resource at /hits?asof=[timestamp] (where [timestamp] is e.g. provided by client-side JS) that builds a dynamic representation based upon the historical value of a hit counter at quantized time N, and also increments the “current” bucket’s value upon access.

The difference between the two, is that the resource /hits?asof=N would never be retrieved until N, so it’s transferred representation can be defined to have “always been” the current value of the hit counter at time N, and then cached. Ordering of such requests doesn’t matter a bit; each one has a “natural value” for it’s transferred representation, such that out-of-order gets are fine (as long as you’re building the response from historical metrics,

zepolen · on Feb 11, 2019

Don't be a wise ass, with that definition state changes all the time in memory registers even when no requests are made.

> So, by this measure, the old-school “hit counter” images that incremented on every GET were incorrect

Yes they are incorrect. No Buts.

Two requests hitting that resource at the same exact timestamp would increase the counter once if a cache was in front of it.

klyrs · on Feb 10, 2019

That brings me back to the year 2001, when my boss's browser history introduced Alexa to our admin page and they spidered a bunch of [delete] links. cough cough good thing it was only links from the main page to data, and not the actual data. I spent the next few days fixing several of problems that conspired to make that happen...

Piskvorrr · on Feb 10, 2019

As in, anybody with a link to /delete could delete things? No identification/authentication/authorization needed?

klyrs · on Feb 10, 2019

> I spent the next few days fixing several of problems that conspired to make that happen...

Yes, I was a total n00b in 2001. But then, so was e-commerce.

klyrs · on Feb 10, 2019

and fwiw, I knew exactly how bad our security was... I kept my boss informed, but he had different priorities until Alexa "hacked" our mainpage :p

BorRagnarok · on Feb 11, 2019

If you're not allowed to change state on GET requests, how do you implement timed session expiration in your api? You can't track user activity, in any way, on get requests, but still have to remember when he was last active.

NewEntryHN · on Feb 9, 2019

Idempotence is for PUT requests. GET requests must not have side effects.

BorRagnarok · on Feb 11, 2019

I've heard this "get requests shouldn't have side effects" argument before, but I don't think it works. At least, not for me, or I'm doing something wrong.

For example: Let's implement authentication, where a user logs in to your api and receives a session id to send along with every api call for authentication. The session should automatically be invalidated after x hours of inactivity.

How would you track that inactivity time, if you're not allowed to change state on get requests?

hateful · on Feb 9, 2019

I think this is the argument for PUT instead of POST, not GET instead of POST.

hueving · on Feb 10, 2019

You're confusing idempotency and side effects. A GET should not have any side effects, even if they are idempotent.

dalore · on Feb 13, 2019

It's not about idempotency, but about side effects. The standards mention if it will cause side effects use POST. Logging out does cause side effect (you lose your login) and hence should be a POST.

In the old days it might have been acceptable to get away with a GET request but these days thanks to prefetching (like this very topic) it's frowned upon.

https://stackoverflow.com/questions/3521290/logout-get-or-po...

pbreit · on Feb 10, 2019

GET is also supposed to be “safe” in that it doesn’t change the resource which a logout would seem to violate.

The whole reason this is supposed to be the case is in order to enable such functionality as this instant thing.

abathur · on Feb 9, 2019

Also: sometimes a site is misbehaving (for myself, or maybe for a user we're helping) and it's helpful to directly navigate to /logout just to know everyone is rowing in the same direction.

Using a POST, especially if you're building through a framework that automatically applies CSRF to your forms, forecloses this possibility (unless you maintain a separate secret GET-supporting logout endpoint, I guess).

Guest10928391 · on Feb 10, 2019

When I originally started my community site I used GET for logout. However, users started trolling each other by posting links to log people out. It wasn't easy to control, because a user could post a link to a completely different site, which would then redirect to the logout link. So, I switched to POST with CSRF and never had another issue.

enedil · on Feb 9, 2019

That's exactly the problem with idempotency.

winstonewert · on Feb 9, 2019

actually, no. Idempotency means that you can safely do the same operation multiple times with the same effect as doing it once. That's a different issue than the no-side-effects rule which GET is supposed to follow.

ezequiel-garzon · on Feb 9, 2019

Thanks, how did you find out about data-no-instant tag?

sgustard · on Feb 9, 2019

https://instant.page/blacklist

js4ever · on Feb 9, 2019

Each time you hover over a link it's doing a GET request bypassing the cache (cache-control: max-age), even if you hove the same multiple times. Also this will make all your analytics false... Except that indeed this can improve greatly the user sensation of speed

mariopt · on Feb 9, 2019

The analytics should only be triggered is the page is rendered, assuming it's done client side. I believe Google does this for first top 3 results if I'm not mistaken.

michaelbuckbee · on Feb 9, 2019

This helped me quite a bit, my mental model of how this worked was off. Prefetching only downloads the resources but does not actually execute any js code. So sites doing lots of tag managers or other js loading js likely wouldn't benefit, but standard GA, etc. should be fine.

dspillett · on Feb 9, 2019

Added bonus: these extra pre-loads on hover will tell you if someone nearly clicked a link. Your web server logs contain some poor-man's eye tracking. Could aid determining if important parts of your pages (warning messages and so forth) gather enough attention.

ssl232 · on Feb 10, 2019

Seems like an invasion of privacy. I'd be surprised if someone hadn't used this sort of script for this already. Perhaps browsers or ad blockers will block this feature in the future.

ohmatt · on Feb 10, 2019

There's already other libraries out there like Fullstory, which tracks all the user mouse movements and interactions with the page and allows you to watch a user interact with your site in nearly realtime.

dspillett · on Feb 10, 2019

There are full tracking tools that do a lot more than this already - tracking all mouse movements and key presses. They are often used with the knowledge of users when testing UI changes/experiments, I've even heard of more general versions running as a browser plug-in for medical monitoring (watching for changes in coordination of people with degenerative conditions, without it feeling like an active test accidentally biasing results), though I would also be surprised not to find the idea already used more widely (and perhaps more nefariously) without users knowing.

hombre_fatal · on Feb 9, 2019

Good point.

What would be a good way to avoid prefetch requests in your analytics if you only derive analytics from server access logs?

8bitben · on Feb 9, 2019

Most browsers pass a "purpose:prefetch" or similar HTTP header in the prefetch request that you can use to differentiate

hueving · on Feb 10, 2019

Then how do you know when they actually go to the page? Do you need client side analytics at that point since the browser already has the page in memory?

Drakim · on Feb 9, 2019

You could obviously do it with JavaScript by pinging the server to tell him that log entry X is to be promoted to a true "hit".

For a non-JS solution, I guess an tiny iframe at the bottom of the html page that accesses a special server page with an unique stamp that causes the same "hit". the iframe loading would mean that the rest of the page is mostly loaded before it was closed.

jsjohnst · on Feb 10, 2019

> For a non-JS solution

How do you use this library with JS disabled?

Drakim · on Feb 10, 2019

The iframe would have to be part of the HTML document from the very start. Maybe a server-side pre-processor that appends it as it's served.

mygo · on Feb 9, 2019

If a GET request is all it takes for your analytics to be messed up, then your analytics are not reliable in this era of bots pinging everything.

You need to be emitting your analytics events from a rendered/executed page. Preferably with javascript, and a fallback <noscript> resource link to a tracking url can work here.

byron_fast · on Feb 9, 2019

What's more important - analytics, or the user experience. Damaging stats sucks, but this seems to outweigh that.

dentemple · on Feb 9, 2019

Clearly you're not in middle management.

williamdclt · on Feb 10, 2019

In a lot of legit cases, analytics. Especially versus a UX improvement that is clearly minor. The word "analytics" has bad rep but it's massively important to know if you're building the right product for your users, without a sort of analytics you're shooting in the dark

aiisjustanif · on Feb 10, 2019

Without analytics, I don't justify my project soooo

chatmasta · on Feb 9, 2019

Might be worth adding something like a Pre-Fetch: True header to pre-fetched requests. But then the problem is, if the user pre-fetches and then actually views it, how does your analytics know unless the client then sends another request?

drewmol · on Feb 10, 2019

So would setting erroneous pre fetch headers to all requests would help to spoil analytics insights?

cjblomqvist · on Feb 9, 2019

It seems everybody is missing this but this could actually slow down your experience, and I'd actually guess it will in some scenarios (ie. not only a theoretical situation).

Considering a user hovering over a bunch of links and then clicking the last one, and doing this in a second. Let's assume your site takes 3 sec to load (full round-trip) and you're server is only handling one request at a time (I'm not sure how often this is the case, but I wouldn't be surprised if that's the case within sessions for a significant amount of cases). Then the link the user clicked would actually be loaded last, after all the others - this probably drastically increase loading time.

The weak spot in this reasoning is the assumption that you're server won't handle these requests in parallel. Unfortunately I'm not experienced enough to know whether that happens or not, but if so, you should probably be careful and not think that the additional server load is the only downside (which part like is a negligible downside).

bherms · on Feb 9, 2019

It actually cancels the previous request when you hover over another link

collinmanderson · on Feb 9, 2019

Client side canceled doesn’t necessarily translate to server side canceled.

I used to use a preload-on-hover trick like this but decided to remove it once we started getting a lot of traffic. I was afraid I’d overload the server.

x15 · on Feb 9, 2019

I'd also hesitate wasting resources in such a way.

About your first statement though, which server software do you use that still sends data after the client has closed the connection? Doesn't it use hearbeats based on ACKs?

collinmanderson · on Feb 10, 2019

It doesn’t send the response to the client, but it still does all the work of generating the response.

I use nginx to proxy_pass to django+gunicorn via unix socket. I sometimes see 499 code responses in my nginx logs which I believe means that nginx received a response from the backend, but can’t send it to the client because the client canceled the request.

I admit I haven’t actually tested it directly, but I’ve always assumed the django request/response cycle doesn’t get aborted mid request.

hombre_fatal · on Feb 10, 2019

The server is still doing all of the work in its request handlers regardless of whether client closed the connection.

jsjohnst · on Feb 10, 2019

Not if the server is setup correctly.

hombre_fatal · on Feb 10, 2019

That doesn't make sense. You can't just "config a server" to do this. Even if a web framework tried to do this for you, it would add overhead to short queries, so it wouldn't be some universal drop-in "correct" answer.

Closing a connection to Postgres from the client doesn't even stop execution.

jsjohnst · on Feb 10, 2019

> You can't just "config a server" to do this.

Unless you are focusing on the word server and assuming that has nothing to do with the framework/code/etc, then I can assure you it can be done. I’ve done it multiple times for reasons similar to this situation. I profiled extensively, so I definitely know what work was done after client disconnect.

Many frameworks provide hooks for “client disconnect”. If you setup you’re environment (more appropriate term than server, admittedly) fully and properly, which isn’t something most do, you can definitely cancel a majority (if not all, depending on timing) of the “work” being done on a request.

> Closing a connection to Postgres from the client doesn't even stop execution.

There are multiple ways to do this. If your DB library exposes no methods to do it, there is always:

pg_cancel_backend() [0]

If you are using Java and JDBI, there is:

java.sql.PreparedStatement.cancel()

Which does cancel the running query.

If you are using Psycopg2 in Python, you’d call cancel() on the connection object (assuming you were in an async or threaded setting).

So yes, with a bunch of extra overhead in handler code, you could most definitely cancel DB queries in progress when a client disconnects.

[0] http://www.postgresql.org/docs/8.2/static/functions-admin.ht...

h1d · on Feb 10, 2019

I don't think it cancels database queries.

zamalek · on Feb 10, 2019

Depending on the framework it can. That's the purpose of the golang Context and C# CancellationToken.

Operyl · on Feb 10, 2019

I believe PHPFPM behaves in this way. When the client disconnects from the web server, their request stays in the queue for a worker to pick up, I don’t believe there is a way to cancel it.

CydeWeys · on Feb 9, 2019

Per my cursory reading of the source code it looks like it might only prefetch one link at a time: https://github.com/instantpage/instant.page/blob/master/inst...

If that's not how it works, it could easily be modified to add a throttle on how many links it will prefetch simultaneously.

stonewhite · on Feb 11, 2019

This library should be only handling cache logistics, moving a cdn cache to browser. Otherwise is ill advised because of the reasons specified.

Every site that can use a last mile performance optimization like this should already be serving everything from some form of cache, either from varnish or a cdn. So in theory, availability of the content should not be the problem.

shaklee3 · on Feb 10, 2019

It also has a minimum time of hovering, so if you're going by it quickly it won't fetch anything.

rurcliped · on Feb 9, 2019

Many people browse the web from an employer who has rules about what types of pages may be accessed. For example, a person applying for a job with my team may include a link to a web page about their job-related background -- portfolio.html or whatever. HR tells us to be sure we don't follow links to any other page that may be more personal in nature, such as a page that reveals the applicant's marital status (which can't legally be considered in hiring decisions here). HR doesn't want to deal with complications from cases where we reject an applicant but there's a log entry showing a visit to, say, family.html from our company's IP address block. We'd prefer that prefetching isn't a default.

There's also log analysis to identify the set of web pages visited by each employee during work hours, and an attempt to programmatically estimate the amount of non-work-related web browsing. This feeds into decisions about promotions/termination/etc. Prefetching won't get anyone automatically fired, but we'd still prefer it isn't a default.

thatcat · on Feb 9, 2019

If you need people to design their websites around your companies HR metrics collection for them to work, then your HR department's metrics are the problem. The easiest way to improve productivity at your company could be to drastically cut HR funding, maybe HR should collect some metrics on how much time HR spends building datasets that aren't accurate.

nohnce · on Feb 9, 2019

I don't think it's a matter of HR necessarily. HR has guidelines because a lot of their job has to do with compliance. Big companies absolutely need large HR departments because a lot more regulations apply.

hueving · on Feb 10, 2019

Which regulation states that promotions should be based on browser history?

mpnordland · on Feb 9, 2019

While this is an approach you could take, it's not a very user friendly one. I agree with you that those HR policies sound draconian, but OP isn't necessarily in a position to challenge or change them.

In addition to being less user friendly, having the mindset that users/visitors to your website must live and work in ideal settings means that whatever you create will tend to be fragile and brittle because you don't try to take into account situations that you haven't seen before.

hombre_fatal · on Feb 9, 2019

Jesus. I hope they pay you well for that.

I've heard a lot of stories of ridiculous rule-by-HR culture, but that's so extreme it sounds made up.

nextos · on Feb 9, 2019

I don't think it's made up, because I experienced the same thing in a pretty well-known European research center...

Of course, had I known about these practices in advance, I would have declined the job offer. But I didn't. I ended up quitting a few weeks later anyway.

IT would monitor all connections from all employees and send a report to upper management with summary statistics, on a monthly basis.

I was told this was the case by a fellow worker during my second day there, so I tunneled my traffic through my home server via SSH. When IT asked me why I had zero HTTP requests, I reminded them that monitoring employees traffic was illegal under our current legislation. Doing this in a university-like non-profit research center is hard to justify.

lucasverra · on Feb 10, 2019

So they asked you you are surfing the web on a insecure protocol that can compromise internal data ?

Couldn't you just say "I just dont use http anymore because this X company data is very valuable to me" ?

dingaling · on Feb 10, 2019

I don't see why it's hard to justify. They are providing facilities for you to perform the work they request, not for your personal benefit.

Invert the scenario: if they told you that you had to do work-based research on your own personal Internet connection, would that be OK? Any overage charges are yours to pay, no compensation.

strombofulous · on Feb 9, 2019

The part about viewing family.html seems kind of understandable. If you assume no bad actors, then it's crazy... But we're all developers here, we know that you have to assume the existence of bad actors, and assume that they are going to target you (which is why we you always validate data client data server-side). I could see how viewing family.html could turn into a real headache for HR/Legal, especially if the law says that you can't discriminate based on family information.

The other part about log analysis seems crazy, though, I agree with you on that.

hombre_fatal · on Feb 9, 2019

We're talking about evidence in a discrimination court case that points to an IP address associated with a company that visited /family.htm around the time someone applied for a job they didn't get. Like, that person went through their blog's access.log when they got home, defeatedly looking up IP addresses, and going "aha, jackpot!"? And everyone in the company hovers links to be sure they don't go to /as-a-black-man.htm during the hiring process? And the fear seriously is that prefetching might be what spurs this chain of events?

That sounds batshit insane.

scarejunba · on Feb 9, 2019

Yes, I can't see how anyone could think this is sane in any sense. Brb putting my kids in my GitHub profile picture.

hueving · on Feb 10, 2019

Just go right for the kill and put your marital status, ethnicity, and sexual preferences right below your name on your resume. That way they're trapped the instant they open it!

Gaelan · on Feb 9, 2019

A while back I saw a chrome extension that hid profile pictures on GitHub specifically for this reason.

gnulinux · on Feb 9, 2019

One of my biggest career goals is never to work in a company like that. A company that micromanages me to the point they search my fucking internet history to decide my termination doesn't deserve my technical skills.

saluki · on Feb 9, 2019

Double True!

jimnotgym · on Feb 9, 2019

The second hn item today that made me agree that developers need a union.

I always let my team browse Facebook, if they wanted. One of my top people browsed it the most out of everyone. If you block a page, then they will just use their phone.

If you are going to measure, then measure outputs. Measuring inputs will make you and your team equally unhappy.

ksec · on Feb 9, 2019

Measuring Output! Especially true for which their job require more thinking than actual labour. It doesn't matter how much time he spent on it, as long as it gets done. And more often, you actually need to play a game, relax or something different before coming back and solve the problem in seconds as compare to staring at it for hours and not getting anywhere.

UweSchmidt · on Feb 9, 2019

Sounds terrible: An automated analysis of an employee's behaviour linked to promotions and even terminations? Should be outright illegal.

The webbrowsing of other people is their private matter. If you think someone is surfing the web too much (or taking too many coffee breaks, or leaning on the shovel for a minute or any other normal activity people do to take little breaks from work), it's on leadership to tell the person to get back to work, or generally, create a work environment where work flows more naturally.

Logs are for investigations in case of crimes etc.

I read so many things here on HN that are illegal in Germany. We have laws and powerful worker representation that prevents dehumanitzing stuff like that but, often, things started in the US find their way over here....

dspillett · on Feb 9, 2019

> There's also log analysis to identify the set of web pages visited by each employee during work hours, and an attempt to programmatically estimate the amount of non-work-related web browsing.

If your company is doing that, then do not browse on company time and/or using company equipment at all. Ever. They obviously don't (or for regulatory reasons, can't) trust you, so you should treat them as an adversary for your own good.

Remember: HR exists to protect the company, not the employees.

bdcravens · on Feb 9, 2019

Any companies with such onerous policies could block sites like instant.page at the firewall.

mikeash · on Feb 9, 2019

Why would they block being handed extra leverage over their employees?

Existenceblinks · on Feb 9, 2019

I had seen that kind of firewall before when a sale presented what his firewall is capable of at company I worked with. I'm not sure how I felt back then.

I mean that firewall was used to track every website you browse and other evil stuff.

hueving · on Feb 10, 2019

>This feeds into decisions about promotions/termination/etc.

Where do you work that makes tolerating that level of idiotic behavior worth it? It the job super interesting or the pay above market rate? If not, there are much greener pastures my friend.

Companies that treat their employees like morons eventually push out everyone who is not one.

veilrap · on Feb 9, 2019

That sounds like hell.

mosselman · on Feb 9, 2019

That story about 'personal' pages sounds like an excuse for tracking what you do on your computer. I'd look for a different job if I were you to be honest. It sounds like a toxic environment to work in.

mcherm · on Feb 10, 2019

> Many people browse the web from an employer who has rules about what types of pages may be accessed.

I think only very few people browse the web from an employer who has rules about what types of pages can be fetched. As a web developer, if I can make a faster experience for 99% of my population at the cost of potentially annoying the HR department of some tiny fraction of them, I'm going to do it. And I won't feel bad about making it slightly more difficult for my site's visitors' management to effectively surveil their browsing habits -- not my problem!

apolretom · on Feb 9, 2019

Could you say what company this is, so that I can make sure I never work there?

z3t4 · on Feb 9, 2019

You can surf with JavaScript turned off, or use your private mobile phone.

elliekelly · on Feb 9, 2019

You should find a different company to work for.

arpa · on Feb 10, 2019

Just wondering, which continent are you located in?

Does your company tell its' employees about their Orwellian policies upfront when hiring, or is it a public secret?

How do you deal with encrypted traffic, e.g. https?

Some companies simply filter web traffic via corporate white/blacklists, maybe you have some insights why an illusion of freedom has been chosen in your company?

sarbaz · on Feb 9, 2019

Sounds like prefetching is good for you, since greater prefetching provides greater plausible deniability.

happppy · on Feb 11, 2019

Tbh I would have been fired the next day.

P.S: Using HN at my office. Was looking at career pages of other software companies a while ago.

bberrry · on Feb 10, 2019

I hope I never end up in your dystopian part of the world.

eknkc · on Feb 9, 2019

Is there a novel about this company?

rprime · on Feb 9, 2019

Is this a re-rebranding? I remember using something similar 4-5 years ago (instant.js/instantclick).

But quite an interesting little thing, especially useful for older websites to bring some life into them. The effect is very noticeable.

dieulot · on Feb 9, 2019

Kind of. It’s different than InstantClick in that it uses <link rel="prefetch"> and can thus be embedded in one minute, while InstantClick transforms a site into an SPA and require additional work.

It’s a different product. The initially planned name for it was “InstantClick Lite”.

rprime · on Feb 9, 2019

Oh interesting, I'll give this a go.

Existenceblinks · on Feb 9, 2019

I've seen this kind of features several times on libs. And sometimes I can't un-think about it while my mouse is hovering on links in normal life. It's .. prefetching or not?

pbhjpbhj · on Feb 9, 2019

Hmm, I wonder: if the user isn't asked to authorise the action whether it technically breaches the Computer Misuse Act (UK).

If you send a requested page, that's obviously fine. Normal use of websites is expected, but unilaterally instructing your site visitor's browser to download further unrequested content that's not part of a requested resource ...?

mcdevilkiller · on Feb 10, 2019

What about downloading tracking js you didn't ask for then?

paintballboi07 · on Feb 10, 2019

Since nothing is actually executed until you click on the link, the only issue would be those on metered connections.

jypepin · on Feb 9, 2019

This is a feature available on most web frameworks today (for example Link's prefetch on Next.js), but still could be very useful for smaller website and other static pages not using such frameworks.

I'd be a little wary of using a script from an unknown person without being able to look at the code - I'd rather see this open source before using. Especially being free and MIT licensed, I don't see why it wouldn't be open.

devinl · on Feb 9, 2019

In the technical details, he has a link to the open source on github. Here's the js that's actually doing the preloading: https://github.com/instantpage/instant.page/blob/master/inst...

jypepin · on Feb 9, 2019

I stand corrected then, thanks for sharing :) I missed that part!

glitcher · on Feb 9, 2019

Just go directly to the script url: https://instant.page/1.0.0

The code is not obfuscated or minified, very easy to read.

asaph · on Feb 9, 2019

It probably should be minified if the whole point is to improve page load times.

daxterspeed · on Feb 9, 2019

Compression and caching makes any minification of small scripts more than negligible.

asaph · on Feb 9, 2019

Perhaps you meant less than negligible? Or simply negligible? But not more.

hk__2 · on Feb 9, 2019

Minification is free and is done only once. In our case, the script is .9kb compressed (2.9kb uncompressed). When minified, it goes down to .6kb compressed (1.1kb uncompressed). It’s a small improvement, but there’s no reason to ignore it.

pbhjpbhj · on Feb 9, 2019

Saves over 50% of bandwidth ... "it's a small improvement". ^_^

hombre_fatal · on Feb 10, 2019

They mean the 300 bytes shaved from gzip to minified+gzip.

Both are under MTU size of TCP packet.

austinwm · on Feb 9, 2019

It appears that the source code [1] is linked from their Technical Details page.

[1] https://github.com/instantpage/instant.page

colvasaur · on Feb 10, 2019

Isn't everything client-side on the web inherently open-source?

danra · on Feb 9, 2019

Is there something similar available on Django?

samcodes · on Feb 9, 2019

This should be completely backend agnostic. I was never a Django person, but you’d just put that script tag in your main template so it loads on every page.

collinmanderson · on Feb 9, 2019

I use Django and have done similar things. yes this should be backend agnostic.

saagarjha · on Feb 9, 2019

Perhaps this something that a browser should be doing, instead of websites themselves?

gwern · on Feb 9, 2019

Enabling it by default Internet-wide seems like it could be a bad idea for many reasons. If I go out of my way to enable it on my site, I am taking responsibility for the bandwidth and any side-effects of prefetching a link and understand what I am doing. But if it is simply enabled Internet-wide, isn't that bordering on a DDoS? What about poorly-coded websites/apps where GETS are not idempotent or have side-effects? What about server-side analytics/logs tracking HTML downloads?

anoncake · on Feb 9, 2019

> What about poorly-coded websites/apps where GETS are not idempotent or have side-effects?

They're already broken, exposing that is a good thing.

lwansbrough · on Feb 9, 2019

Breaking the web is not a good thing. Regardless of how you think things should be done.

anoncake · on Feb 9, 2019

Indeed, breaking the web by misusing GET is not a good thing. By extension, keeping the web broken by not exposing this breakage is not a good thing either.

yoz-y · on Feb 9, 2019

Like mentioned in another comment, if somebody used a GET http link to logout from a webpage, you would end up with a ton of surprised users. People who read articles by highlighting the text with the mouse would also probably hover over all of the links and would end up wasting bandwidth for no reason.

anoncake · on Feb 9, 2019

> if somebody used a GET http link to logout from a webpage,

If you violate the standards, your website doesn't work. Who knew?

> People who read articles by highlighting the text with the mouse would also probably hover over all of the links and would end up wasting bandwidth for no reason.

"For no reason" is obviously wrong, making the web snappier is a reason.

Maybe browsers should only prefetch links on bloated websites since their owners clearly don't mind wasting bandwidth.

lwansbrough · on Feb 10, 2019

This is just an ignorant response. The history of the internet is littered with pragmatic solutions to standard vs. non-standard approaches for exactly these reasons. See: <image>, Referer header, the HTML standard as a whole.

By the way, if your standard contradicts a popular methodology, it's probably a bad standard.

anoncake · on Feb 12, 2019

> By the way, if your standard contradicts a popular methodology, it's probably a bad standard.

You can't assume a methodology is good just because it's popular. That's how you get cargo cults.

But if a methodology violates the standards, it's almost certainly bad.

m1sta_ · on Feb 9, 2019

Vulnerable isn't broken

hunter2_ · on Feb 9, 2019

The heuristics to exclude logout links and the like would be very disruptive. Those decisions need to be in the website author's hands.

However, I think if browsers had this, but off by default until seeing tags to enable it along with any exclusions, that would be great.

chrisseaton · on Feb 9, 2019

I think it would only prefetch GET links, which never have side effects.

hombre_fatal · on Feb 9, 2019

There's nothing stopping GET requests from having side effects.

It's like pointing to a list of best practices and saying "everyone surely follows these."

For example, someone changed their signature to `[img]/logout.php[/img]` on a forum I posted on as a kid and caused chaos. The mods couldn't remove it because, on the page that lets you modify a user's signature, it shows a signature preview. Good times.

arendtio · on Feb 9, 2019

I think it was a joke as GET requests are not supposed to change anything, but often they do (probably because many devs don't know about, understand or respect the RESTful concept).

EDIT: For completeness, I have to add, that I am also part of the group of people who have violated that concept. Maybe neither frequently nor recently, but I did it too :-/

chrisseaton · on Feb 9, 2019

> understand or respect the RESTful concept

It's nothing to do with REST. It's part of the HTTP spec and has always been, that "GET and HEAD methods should never have the significance of taking an action other than retrieval".

arendtio · on Feb 9, 2019

Well, if I am not mistaken, REST is just the articulated concept on which HTTP was built. So yes, the HTTP spec (probably) existed before REST became a term itself, but in the end, there is no reason to argue if REST defines it or HTTP.

chrisseaton · on Feb 9, 2019

> There's nothing stopping GET requests from having side effects.

> It's like pointing to a list of best practices and saying "everyone surely follows these."

It’s not a ‘best practice’ it’s literally the spec for the web.

hombre_fatal · on Feb 9, 2019

What percent of developers do you think have even read the RFC?

Browsers take a more practical approach than "well, it's in the spec, they should know better" which is apparently what you're suggesting.

It's the same reason browsers will do their best to render completely ridiculous (much less spec-complaint) HTML.

lixtra · on Feb 9, 2019

To prove your point: If I remember correctly HN votes are sent as GET.

tedunangst · on Feb 9, 2019

You're typing this comment on a site that has a GET link to logout.

hunter2_ · on Feb 9, 2019

This phrase "GET link" I keep seeing makes sense, but strikes me as odd. Is that to differentiate from an "a" tag that triggers JS that makes a fetch/xhr with another method? The only non-JS non-GET request I'm aware of is a form action (POST by default, GET if specified) which can hardly be called a link, unless I'm wrong to equate link with "a" tag.

Zarel · on Feb 9, 2019

Form actions are actually GET by default (think search forms). You need to explicitly use <form method="post"> for a POST form.

hunter2_ · on Feb 10, 2019

Ah, yep.

jfoster · on Feb 9, 2019

It could be a way for browsers to encourage GET to be used more correctly.

hombre_fatal · on Feb 9, 2019

Seems like you'd be punishing users instead of website operators since the cause/effect relationship is so unobvious.

User happens to brush over the logout button while using the site. On their next click, they're logged out. Weird. Guess I'll just log in again. Doesn't happen again for some time, but then it does. Weird, didn't that happen the other week? What's wrong with my browser? Oh cool, switching browsers fixed it. You're having that issue, too? Don't worry, I figured it out. Just switch browsers.

markovbot · on Feb 9, 2019

It doesn't have to be. Could start by allowing website authors to opt in via a tag in the <head> or something, then opt out on a per-link basis with an attribute (eg preload=false)

trsfgbb · on Feb 9, 2019

I remember using some web accelerator 15-20 years ago. Prefetching all links on webpages. It was really helpful on modem/ISDN connection.

Standalone program on Windows XP (or maybe Windows 98?). It had it's own window where you could see which pages it was loading.

Does anyone know the name?

thijsvandien · on Feb 9, 2019

It's funny how much more sense this made on old-fashioned dial-up connections. Back then, as far as I remember, there was no data limit as such. The only thing that counted was connection time. Rather than sitting there reading something while generating ticks, you could better download much of the site and disconnect. An old form of rush to idle.

Wowfunhappy · on Feb 9, 2019

Doing it by default seems a bit invasive, but I'd be interested in this as a configurable option or plugin!

benologist · on Feb 9, 2019

Embedding via // without explicit SSL should probably be considered harmful or malicious as there is no reason to make such scripts available without SSL. Even if the end website is not using SSL users can still fetch your script securely.

jakelazaroff · on Feb 9, 2019

The example snippet uses SRI [1], so there's no security issue with plain HTTP.

[1] https://developer.mozilla.org/en-US/docs/Web/Security/Subres...

Holybedd · on Feb 9, 2019

It's not supported by IE.

jakelazaroff · on Feb 9, 2019

IE doesn't support script type "module" anyway, so it'll ignore the script tag: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/sc...

dieulot · on Feb 9, 2019

There’s no security gain from going to HTTPS if the site is served over HTTP, but there’s a small speed hit.

benologist · on Feb 9, 2019

The communication between the user and example.com downloading the page referring to your script is secured by their SSL if they have it.

Separately to that, the communication between the user and your server when downloading your script is secured by your SSL. This can be secure even if example.com is not, so it should only be secure.

esrauch · on Feb 9, 2019

If the first html load isn't on SSL, and someone is able to intercept your traffic, they can change the embedded https url to be a non-https url anyway, so I can't even imagine the attack that is prevented by using https into something loaded over http.

benologist · on Feb 9, 2019

Absolutely correct. But this is the website owner's problem and their consequences for not using SSL. You can't help or prevent this because it's not your server, it's not your fault they enabled insecure communication that can be exploited.

When you forgo SSL on your own server someone can also intercept your script in exactly the same way, they don't need to hack the website embedding your script. Now they are your consequences, your fault there's no SSL, and your problem may be affecting everyone who embedded your script insecurely.

swiftcoder · on Feb 9, 2019

No site should be served over plain HTTP in 2019. Browsers and search engines are actively discouraging/downranking websites that don't use TLS at this point.

veryrandomdude · on Feb 9, 2019

None should be, but several are. Just the way that it is.

CydeWeys · on Feb 9, 2019

By the way, since .page is HSTS-preloaded, you may as well include https:// in the code snippet that includes the library. It'll avoid the http-to-https link rewriting internal redirect from happening when included from a non-secure site. It's a tiny performance improvement, but across millions of page views, it might add up.

bastawhiz · on Feb 10, 2019

If the browser sends a Referer header, the page the user is currently on will be sent over plaintext.

dmit · on Feb 10, 2019

For exactly this reason browsers don’t send a Referer header when an HTTP request is made from an HTTPS page. (Nor for any kind of request made from a local file.)

MrStonedOne · on Feb 9, 2019

No!

You should NEVER load javascript over https on a page that was served over http.

It gives a false sense of security that doesn't exist.

Because the source page was served over http, the source page can be modified by an attacker, making the script be loaded under ssl makes you think its protected. But its not since an attacker could just modify the script tag to remove the https bit in transit then modify the script thats now being loaded over http.

In sort, forcing the script to use https gives you no gain when the page that includes it is served over http and it tricks you into thinking that javascript asset is secure.

rebane2001 · on Feb 9, 2019

Who does this affect? The attacker can always modify the page however they want anyways and the https for the js source would mean the js itself cannot be tampered with

Does it help much? No. Should you use it? Yes.

johnmaguire · on Feb 9, 2019

Who is being tricked? I don't suspect the user is checking the script tags to see if they are secure. At the very least, this might stop basic ISP tampering used to inject bandwidth warnings, etc.

MrStonedOne · on Feb 14, 2019

The programmer that put the script tag in there.

IgorPartola · on Feb 9, 2019

Or just stop using HTTP like a normal person. Who still has pages served over anything hit HTTPS?

benologist · on Feb 9, 2019

     Who still has pages served over anything hit HTTPS?

Someone who wants to explore our emerging financial liability as we grow increasingly compelled by law to actively protect data in transit (that's SSL), at rest, and in distribution.

     Penalties for violations can be huge, as much as 20 
     million euros or 4% of the company's annual turnover.

https://blog.quttera.com/post/gdpr-and-website-security/