More

BCharlie · on Aug 25, 2020

I mention that the data that appears to be used for those purposes is sent again in a separate request to a separate end point, so we have two types of requests: last read location, and reading analytics. Sorry it wasn't clear, I'll try to improve the wording.

falcolas · on Aug 25, 2020

Will you also be updating and noting that the requests to Wikipedia and Bing are for explicit customer-benefiting features?

Might be worth noting that you can opt out of their data collection (on the e-reader, at a minimum) as well. Settings > Device Options > Advanced Options > Privacy or in the device management console in your account on amazon.com

danShumway · on Aug 25, 2020

The text in question:

> Highlighting or tapping any word will send the requests with the text to Bing Translate and Wikipedia, as well as back to Amazon.

Is there a reason why that text needs to be sent before the user clicks the "translate" button? Is there a reason why it needs to be sent to Amazon?

fouric · on Aug 25, 2020

> Is there a reason why that text needs to be sent before the user clicks the "translate" button?

Yes - UX latency. I would expect this kind of thing to take a few thousand milliseconds, and shaving off a few hundred milliseconds from between when the user highlights text and when they select "translate" is significant. The fact that this data is being sent to Wikipedia of all places further signals that the usage is likely to be innocuous.

Do I think that this is globally a good design decision? No, for both engineering and privacy reasons. There's definitely no good reason why it should be sent to Amazon at all.

falcolas · on Aug 25, 2020

> There's definitely no good reason why it should be sent to Amazon at all.

I was wracking my brain on this, and all I could come up with was "to independently verify the invoicing for Bing translations" and "how many times are people accessing the definition/translation and not highlighting". So, analytics, not something that explicitly benefits the user.

benmller313 · on Aug 25, 2020

Can we stop pretending that analytics don't explicitly benefit the user? Product Engineering organizations rely on analytics to improve user experiences.

naikrovek · on Aug 25, 2020

Analytics can be done less granularly and still benefit the user. Also, surely not every data point collected is used to benefit the user.

For example, Amazon doesn't need to know where I am when I request a definition or translation. If they're concerned about usage, they only need to know how many times I actually used one or both of those features per day, per week, or month. They don't need to know instantly every single time a word is highlighted.

edanm · on Aug 26, 2020

> Analytics can be done less granularly and still benefit the user. Also, surely not every data point collected is used to benefit the user.

How? For all we know, it isn't granular - it might be aggregated at the server level to hide specific user's actions. But they'd still need to be sending in the data from the device to the server.

wongarsu · on Aug 26, 2020

The device could keep a daily count of interesting actions, and sync that to analytics servers on a daily or weekly basis. That preserves 95% of legitimate use cases while leaking much less private data (like how my reading habits are distributed across the day)

edanm · on Aug 26, 2020

I mean, you're still collecting most of the problematic data. And you might legitimately be interested in what you're leaving out - knowing time of day that people do things is actually important for plenty of use cases.

_abox · on Aug 25, 2020

They can but they're often much more than that.

Also it should really be opt in. Our at least opt out. I hate Amazon looking over my shoulder while reading a book.

patcul100 · on Aug 25, 2020

That's why I don't use their reading app and use a custom OS.

SamReidHughes · on Aug 26, 2020

The word choice we want here is directly vs. indirectly.

throwaway_pdp09 · on Aug 25, 2020

I'm surprised you'd say that. Out of interest, how does analytics help websites not use blathery, unhelpful text in overly-small fonts, done too-pale to make them unreadable. A lot of UI failings are of this most basic kind.

luckylion · on Aug 25, 2020

When you play an online slot game where you bet money that some numbers will appear on screen, and they use analytics to "improve user experience" (read: engagement, read: you losing more money), is that benefiting you or is it benefiting them?

bberenberg · on Aug 25, 2020

Kindle devices have a dictionary on device. By looking into which words are most frequently defined, they can add these to the local dictionary to help improve the speed of the UI.

freeone3000 · on Aug 25, 2020

The screen refresh rate on these devices is measured in seconds, so a few hundred millis of network latency is impossible to display.

fouric · on Aug 25, 2020

This isn't universally true - Dan Luu's computer latency page[1] lists three Kindles, all below 900 ms of latency. And, since some devices have latency as low as 570 ms, it makes sense that they would use this optimization.

[1] https://danluu.com/input-lag/

notatoad · on Aug 25, 2020

have you actually used a kindle? it certainly doens't take seconds for the definitions to pop up. a full-page refresh might take a second, but most page turns or UI interactions are partial draws and are much faster.

II2II · on Aug 26, 2020

I suspect that they were overstating a limitation of these devices rather than speaking from inexperience. While it has been years since I've used a Kindle, I do use Kobo devices and the delays are perceptible. While changing a page may be quite quick, user interface elements (such as a box containing a definition) seem to take longer. I suspect that they have to be more agressive when refreshing the screen before and after these user interface elements are displayed in order to make the ghosting less perceptible.

If you want to see what I mean by the ghosting of user interface elements being more perceptible, try using KOReader. The ghosting after using a menu can be quite noticable (at least on Kobo devices, which are based on the same technology).

dahfizz · on Aug 26, 2020

You're exaggerating how slow the screens are.

And the fact that the screens are slow should be motivation to make the rest of the system as responsive as possible. A good software engineer will work around bottlenecks, not shrug their shoulders and introduce new ones.

lapetitejort · on Aug 25, 2020

Also remember that "Kindle" can refer to an app on your phone or desktop computer, all of which may share code related to highlighting and translating.

shajznnckfke · on Aug 25, 2020

That doesn’t seem right. Let’s consider the screen refresh to be like a subway station, where the train shows up every few seconds. We need the text we want to show to the user to be at the stop waiting when the train arrives. If we miss the train, we need to wait for the next train to get our text on the screen. The network latency delays when we show up to wait at the station.

If the refresh rate is 5 seconds, and the network response time is 500ms, than eliminating the 500ms response time means we are 10% less likely to miss the train. On average, the time for the text to appear on the screen decreases by 500ms.

All this assumes the refreshes happening on a static schedule. If the software can trigger the refresh, then it’s a lot simpler. The 500ms improvement in latency would apply equally to every engagement with the translate feature.

freeone3000 · on Aug 25, 2020

There's no static schedule. It's an e-ink display. Refreshes happen when software tells it to display something new and take several hundred millis per blank - and a screen can be up to three blanks (because if it doesn't go white-black-display, then some pixels get stuck "on" or "off" or "halfway").

shajznnckfke · on Aug 25, 2020

In that case, it’s clear that eliminating the network request before triggering the refresh directly reduces the amount of time the user has to wait to see the result.

majormajor · on Aug 25, 2020

There isn't a "translate button" - the selection of the word i the button for define/translate/wiki. You swipe between the three cards.

I like this, as a user. I don't want MORE buttons to tap through when I'm trying to define or translate a word. Especially since the Kindle eink screen and UI is not the most responsive.

jonahrd · on Aug 25, 2020

This is literally my #1 used feature of my Kindle. I read texts in different languages to have a quick access to single-tap translations.

If it took 2 taps, I would switch platforms.

larrik · on Aug 25, 2020

On the iOS app it all appears instantly-ish when I highlight, so I'm guessing it's just the same codebase.

halbritt · on Aug 25, 2020

> Might be worth noting that you can opt out of their data collection (on the e-reader, at a minimum) as well. Settings > Device Options > Advanced Options > Privacy or in the device management console in your account on amazon.com

Good tip, I'm going to give this a whirl. Unfortunately, all the network calls add a significant amount of latency even if one didn't care about privacy.

TedDoesntTalk · on Aug 25, 2020

Can you provide the URLs so we can use pihole to block the requests?

boogies · on Aug 25, 2020

(off-topic) What’re the advantages of pihole over /etc/hosts?

_abox · on Aug 25, 2020

That it works for all devices on your network. Even ones that don't have an etc/hosts :)

garblegarble · on Aug 25, 2020

>(off-topic) What’re the advantages of pihole over /etc/hosts?

It's good for cases exactly like this - devices where you don't have control over /etc/hosts (or where you have lots of them and don't want to keep the hosts files in sync). I use it for my Samsung TV to keep them from phoning home (but still letting me use apps)

Edit: you can also set up a DoH endpoint and filter traffic while also allowing Encrypted SNI to work

thaumasiotes · on Aug 26, 2020

> It's good for cases exactly like this - devices where you don't have control over /etc/hosts

Is the pihole a DNS server or a firewall? Sibling comments suggest it's a DNS server, but that doesn't answer this need at all -- if you don't control /etc/hosts, you don't control the device. It can do its resolution however it wants. Most obviously, it can include the domain names you don't want it to reach in its own /etc/hosts file, which you just said you didn't control.

stock_toaster · on Aug 26, 2020

In addition to sibling replies which point out network-wide usefulness... pihole (or any dns server) can/will return NXDOMAIN instead /etc/hosts which will only return an ip. A dns server can also be configured to match a domain and any subdomain (wildcard match) without having to specify each entry individually.

_underfl0w_ · on Aug 25, 2020

They both work similarly if you're using them to block outbound requests, but a Pi-Hole would intercept and block outbound requests for every device on the network where it's installed, whereas editing /etc/hosts would only block requests on a single device (unless that device is your router, I guess?)

Hitton · on Aug 25, 2020

I liked the article. If you are gonna update it, please consider also mentioning technical aspect. Frankly, Amazon snooping on users is to be expected, but short mention of app for which platform have you analysed using which tools would be welcome addition.

ballenf · on Aug 25, 2020

> Frankly, Amazon snooping on users is to be expected

Snooping on users during e-commerce transactions, sure.

But recording user's detailed interactions with every ebook? I hope that's a big surprise to your average Kindle user.

It would be great to see a data request response and how much of this data is retained and for how long. It's clearly not anonymized at the request level.

Very easy to see a future where just reading certain books or reading certain books too many times could flag you as dangerous or be used to support a mental incompetence hearing resulting in loss of rights.

thaumasiotes · on Aug 26, 2020

> But recording user's detailed interactions with every ebook? I hope that's a big surprise to your average Kindle user.

I doubt it. Here are some features the Kindle phone app intentionally advertises to the user:

- prediction of how long the book will take to complete, based on your reading rate

- tracking of whether or not you read anything on any given day

BCharlie · on Aug 25, 2020

That's a great idea! It looks like the 'bad' stuff goes to unagi-na.amazon.com

eterps · on Aug 25, 2020

Exactly, I also added:

mobile-app-expan.amazon.com

cde-ta-g7g.amazon.com

paranorman · on Aug 25, 2020

Awesome, thanks for the quick reply. I'll add this to my pihole config.

BCharlie · on Aug 25, 2020

Though I haven't analyzed other devices (because I don't own them), they could easily have similar issues. I personally really want an open e-ink device, but I haven't seen one for sale unfortunately. For now, I do Calibre ODPS server with Marvin app on a phone, but it doesn't really compare.

BCharlie · on May 20, 2020

Yikes! Maybe that's the next thing I will take a look at...

BCharlie · on May 20, 2020

That makes a lot of sense. I assumed it was somehow for anti-fraud, though I still don't like it.

BCharlie · on March 3, 2020

I wish I knew of some way to change this paradigm. I have repeatedly seen IT/Engineering teams pull out miracles that save a business, or deliver the critical edge for growth, only to have the business value that same team close to zero shortly after.

It seems to me the gap in time between reducing staffing and disaster is perhaps too long for intuitive connections to be made, but that seems overly simplistic to me too.

HelloNurse · on March 3, 2020

A well-run company doesn't need "miracles". They paid the price of unpreparedness, and successfully rebuilding servers doesn't imply learning lessons and improving processes.

BCharlie · on March 3, 2020

I completely agree. I was only pointing out that even extreme cases don't seem to lead to proper valuations.

Many companies do things in a way where miracles aren't required - but the values perceived by those departments still don't seem in line with value delivered.

angry_octet · on March 3, 2020

Yeah being unprepared for NSA 0-day exploit chains, repacked by the Russians to attack Ukraine, really comes across as normal business conditions.

And after such an attack, you decide lowest cost is the primary IT driver?

ianbutler · on March 3, 2020

Many times purely business folks don't take the time to understand tech as it has a higher initial learning curve and so its a black box domain for them.

Not that they need to achieve high mastery over it but a little understanding goes a long way.

I've seen it time and time again, since they don't get it they fear it and the seeming power it has over their company. They treat it like a force of nature -- something that just is instead of another business domain to learn something about and so when things break there is this weird abdication of self determination like there was no possible way to have prevented things from going down or reaching a point of no recovery.

This all in turn leads to the lower valuation of tech instead of say sales or marketing.

heavenlyblue · on March 3, 2020

So what miracle happened here exactly?

They hired a British company to oversee the recovery from a disaster?

I am pretty certain this is a typical post-Brexit piece about the EU not needing these British companies.

Also Maidenhead is a corporate paradise, not some village in the middle of nowhere afraid of loosing jobs.

BCharlie · on March 3, 2020

Perhaps calling it an heroic effort would suit your taste more?

"Those teams rebuilt around 4,000 servers and 45,000 PCs and other devices" over 10 days (according to another article), while the company "ground to a halt".

I don't know about you, but if I had a team that pulled that out, I would have a deep respect for their service and contribution to the business.

Maybe I am missing something, but I didn't see anywhere where this was a company brought in, I am under the impression it is the company's internal staff recovering from ransomware, and now being laid off.

paulkearney · on March 3, 2020

Can you provide examples?

My broad (20 years) experience is that IT is usually later than forecast and often the 'business' make up the shortfall by producing miracles/take the pain.

rossdavidh · on March 3, 2020

Well, we don't hear about the ones where the IT/Engineering team IS properly valued.

BCharlie · on July 18, 2019

Thanks for this response - lot's of new information here that I'll have to read up on!

BCharlie · on July 18, 2019

You are right on this - I thought you could set multiple sites by setting multiple headers, but it doesn't work that way, which I should have known because headers don't work that way in general...

The recommended way to do multiple sites seems to be to have the server read the request header, check it against a whitelist, then dynamically respond with it, which seems terrible.

Thanks for catching this - I updated the post to reflect this and make it more clear.

unilynx · on July 18, 2019

Actually, headers _do_ often work that way. HTTP says:

Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list

Which applies to HTTP headers such as Cache-Control:, and probably goes back to the email RFCs allowing multiple To: headers.

It's just that Access-Control-Allow-Origin isn't defined to accept a comma list, just like Content-Security-Policy doesn't (which is another header breaking things if it appears more than once)

paulddraper · on July 20, 2019

Headers usually work exactly that way. Cookies and cors are oddball exceptions

BCharlie · on July 18, 2019

Great point! I added a sentence to say that the default is all that's needed.

BCharlie · on July 18, 2019

That is true! I do set frame-ancestors in the sample CSP for this reason. I could probably do a dedicated post on CSP to do it justice, but don't want to overwhelm anyone who just wants to start setting headers.

One good reason to set both options, as I mention in the post, is that scanners who rate site security posture may penalize site owners who don't set both - no harm in doing it that I know of.

Jach · on July 18, 2019

Nitpicking as I like to see practical awareness posts like yours spread: you should link to the CSP spec (v3) as the official site. https://content-security-policy.com/ is useful to get started but is out of date and Foundeo isn't authoritative.

For a "complete" guide (maybe "comprehensive starter guide"?) I'd at least add a note in the x-frame-options section that it's been superseded by CSP and only needed if you must support IE (or I guess please a tool), and if you have interesting frame requirements (i.e. more than one allowed ancestor but not all) you're going to have to use a hack to support that with the old header.

Another interesting callout is that most of the CSP directives can be specified by a meta tag in the markup. Not only is this handy for quick serverless testing but can become necessary if you end up routing through something (like some CDNs) that has a max overall headers limit... CSP headers can get pretty big if you don't just bail out with a wildcard.

Definitely agree CSP can have its own post. It's complicated and still evolving with new spec versions. I recently learned about chrome's feature-policy header proposal, which to me is like more granular script-src policies, so I wouldn't be surprised if some future CSP version just absorbs it...

BCharlie · on July 18, 2019

Thanks for the feedback! I did link the official site, but it's kinda buried in the paragraph and maybe not obvious.

I added some text to the x-frame-options to note the CSP rules - it's a great addition.

Jach · on July 18, 2019

Thanks for considering! I think I wrote the nitpick poorly, it's still early for me. I meant that you're currently linking to https://content-security-policy.com/ as the "official site" but it's not really, just a useful reference (but great to link to and in any case it does link to the official CSP2 recommendation eventually so you're fine). The most "official site" though at the moment is the combination of https://www.w3.org/TR/CSP2/ and the newer https://www.w3.org/TR/CSP3/ that's already implemented by Chrome.

I've reminded myself that v3 still hasn't fully stabilized into an official recommendation despite being in final-draft since October (it's basically closed for new things) so for now awareness of 2 and 3 is probably going to continue to be important for anyone responsible for producing a moderately complex string (guess who that is on my teams ;)). Though even at just level 2 there are a few things I could say about differences in behavior just between Chrome and Firefox... Testing is crucial!

HN For You