Steve Loughran

2022-08-02

Transitive Issues

I am not going to discuss anything sensitive which gets discussed on the hadoop security list, but I do want to post my reply to someone who gave us a list of those artifacts with known CVEs, either directly or in their own shaded packaging of dependent libraries (jackson, mainly), then effectively demanded an immediate fix of them all.

My response is essentially "how do we do that without breaking everything downstream to the extent that nobody will upgrade?". Which is not me trying to dismiss their complaint, rather "if anyone has the answer to this problem I would really love to know".

I do regret that failure to get the OSGi support in to hadoop 0.1x; then we could have had that level of isolation. But OSGi does have its own issues, hence a lack of enthusiasm. But would it be worse than the state we have today?

The message. I am trying to do as much as I can via macOS dictation, which invariably requires a review and fixup afterwards. If things are confused in places, it means I didn't review properly. As to who sent the email, that doesn't matter. It's good that transitive dependency issues are viewed as a significant concern, bad that there's no obvious solution here apart from "I say we dust off and nuke the site from orbit".

(Photo: winter mountaineering in the Brecon Beacons, 1996 (?). Wondering how best to ski down the North Face of Pen y Fan)

Thank you for this list.

Except in the special case of "our own CVEs", all hadoop development is in public, including issue tracking, which can be searched under https://issues.apache.org/jira/

I recommend you search for upgrades of hadoop, hdfs, mapreduce and yarn, identify the dependencies you are worried about, follow the JIRA issues, and, ideally, help get them in by testing, if not actually contributing the code. If there are no relevant JIRAs, please create them.

I have just made a release candidate for hadoop 3.3.4, for which I have attached the announcement. Please look at the changelog to see what has changed.

We are not upgrading everything in that list. There is a fundamental reason for this. Many of these upgrades are not compatible. While we can modify the hadoop code itself to support those changes, it means a release has become transitively incompatible. That is: even if we did everything we can to make sure our code does not break anything, it is still going to break things because of those dependencies. And as a result people aren't going to upgrade.

Take one example: HADOOP-13386 Upgrade Avro to 1.9.2.

This is marked as an incompatible update, "Java classes generated from previous versions of avro will need to be recompiled". If we ship that, all your applications are going to break. As well as everyone else's.

jersey updates are another source of constant pain, as the update to v2 breaks all v1 apps, and the two artifacts don't coexist. We had to fix that by using a custom release of jersey which doesn't use jackson.

HADOOP-15983 Use jersey-json that is built to use jackson2

So what do we do?

We upgrade everything and issue an incompatible release? Because if we do that we know that many applications will not upgrade and we will end up having to maintain the old version anyway. I'm 100% confident that this is true because we still have to do releases of Hadoop 2 with fixes for our own CVEs.

Or, do we try and safely upgrade everything we can and work with the downstream projects to help them upgrade their versions of Apache Hadoop so at least the attack surface is reduced?

This is not hypothetical. Look at two pieces of work I have been involved in recently, or at least tracking.

PARQUET-2158. Upgrade Hadoop dependency to version 3.2.0.

That moves parquet's own dependency from hadoop 2.10 to 3.2.0, so it will actually compile and run against them. People will be able to run it against 3.3.4 too... but at least this way we have set the bare minimum to being a branch which has security fixes on.

HIVE-24484. Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

This is an example of a team doing a major update; again it helps bring them more up-to-date with all their dependencies as well as our own CVEs. From the github pull request you can see how things break, both from our own code (generally unintentionally) and from changes in those transitive dependencies. As a result of those breakages hive and tez have held back a long time.

One of the patches which is in 3.3.4 is intended to help that team

HADOOP-18332. Remove rs-api dependency by downgrading jackson to 2.12.7.

This is where we downgraded jackson from the 2.13.2.2 version of Hadoop 3.3.3 to version 2.12.7. This is still up to date with jackson CVEs, but by downgrading we can exclude its transitive dependency on the javax.ws.rs-api library, so Tez can upgrade, thus Hive. Once Hive works against Hadoop 3.3.x, we can get Apache Iceberg onto that version as well. But if the release was incompatible in ways that they considered a blocker, that wouldn't happen.

It really is a losing battle. Given your obvious concerns in this area I would love to have your suggestions as to how the entire Java software ecosystem, for that is what it is, can address the inherent conflict between the need to maintain the entire transitive set of dependencies for security reasons and the need to avoid breaking everything downstream.

A key challenge is the fact that often these updates break things two steps away, a detail you often do not discover until you ship. The only mitigation which has evolved is shading: having your own private copy of the binaries. Which, as you note, makes it impossible for downstream projects to upgrade themselves.

What can you and your employers do to help?

All open source projects depend on the contributions of developers and users. Anything your company's engineering teams can do to help here will be very welcome. At the very least know that you have three days to qualify that 3.3.4 release to make sure that it does not break your deployed systems. If it does work, you should update all production systems ASAP. If it turns out there is an incompatibility during this RC phase we will hold the build and do our best to address it. If you discover a problem after Thursday, then it will not be addressed until the next release, which you cannot expect to see until September, October or later. You can still help then by providing engineering resources to help validate that release. If you have any continuous integration tooling set up: check out and build the source tree and then try to compile and test your own products against the builds of hadoop and any other parts of the Apache Open Source and Big Data stack on which you depend.

To conclude then, I'd like to welcome you to participating in the eternal challenge of trying to keep those libraries up to date. Please join in. I would note that we are also looking for people with JavaScript skills as the yarn UI needs work and that is completely beyond my level of expertise.

If you aren't able to do this and yet you still require all dependencies to be up-to-date, I'm going to suggest you build and test your own software stack using Hadoop 3.4.0 as part of it. You would of course need to start with up-to-date versions of Jersey, Jackson, google guava, Amazon AWS and the like before you even get that far. However, the experience you get in trying to make this all work will again be highly beneficial to everyone.

Thanks,

Steve Loughran.

-----

[VOTE] Release Apache Hadoop 3.3.4

I have put together a release candidate (RC1) for Hadoop 3.3.4

The RC is available at:

https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/

The git tag is release-3.3.4-RC1, commit a585a73c3e0

The maven artifacts are staged at

https://repository.apache.org/content/repositories/orgapachehadoop-1358/

You can find my public key at:

https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

Change log

https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/CHANGELOG.md

Release notes

https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/RELEASENOTES.md

There's a very small number of changes, primarily critical code/packaging issues and security fixes.

See the release notes for details.

Please try the release and vote. The vote will run for 5 days.

2021-11-28

Achievement unlocked. Collarbones

(dictated in four different systems with minor punctuation fixups, hence the confused case of all the words and a bit of a disjointed feel. some of the worst transcription errors have been fixed by one-handed typing, but a lot left in to show the tools' awfulness. Ive also added some Opinions About how bad speech recognition is.)

As of Thursday afternoon.I now possess.4 collarbones. or one collarbone in four pieces if you look at it that way.

This was not intentional.And for the curious desn't actually hurt, provided I don't actually move it at all. But.I am on the high quality drugs The NHS provided for me. Something with.Codeine in.

I had set off for a.End of Autumn mountain bike ride on a sunny but cool day.And had made it over the bridge.2.The Ashton Court park.I Have ridden., many, many times.

[Side note.I tried dictating this.ThroughGoogle document speech recognition.Full stop.Period. Period.I think I will give a talk next year on the sheer awfulness of the different speech recognition.Systems built into Mac, Windows and.Google.Office.And how?They let down.Anybody?Who cannot actually use a keyboard?The only one.That is vaguely usable.Is.Mac OS offline voice recognition.HyphenBut that still has many flaws.Including.Its use of proper nouns.And the fact that the product is really.Unmaintained. Everyones R&D budget.Is clearly being spent on online speech recognition.With a focus on phones.And short text messages.Full stop.She's where punctuation and correcting what you have typed.Ah.Not considered important .]

[update: google docs speech recognition has just stopped working. im using my windows laptop as a tablet with the on screen keyboard. I'm trying Windows dictation, or, as the product should be known, "Microsoft Something went wrong -try again in a little while"]

returning to why i cannot use a mechanical keyboard, strava shows the end of my journey.

i swung off the road, through the gap in the railings and onto the trail, as i have done many times before except this time,I find myself launched into the air with the bicycle.

I have no idea what went wrong this wasn't a sideways front wheel slide out, more the kind of forward launch you'd do if you went over a 50cm+ dropoff without keeping your weight back Except that there was no dropoff. Maybe I just wasn't holding on to the bars tightly enough for the transition from tarmac to trail and the front wheel just twisted enough to trigger the Physics Event

Strava says I was doing 20 km/h, so 72 kg of cyclist, 13 kg of steel hardtail MTB and a few kg of baggage makes for ~1400 Joules of kinetic energy. The specific heat capacity of the human body is 3.5 J/(g·K).

Its a definite design failure of humans that we cannot disperse that excess energy bye converting it into heat. If we could I'd have been 0.005 Kelvins warmer

I would have waited a few seconds to cool down before heading on my way. Sadly, we don't work like that, the energy has just moved bits of my interior around.
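For anyone wanting to check the sums, here is a rough sketch of the arithmetic. The 3 kg of baggage is my assumption for "a few kg", and the class and method names are made up for illustration; with those numbers it comes out at roughly 1360 J and a little over 0.005 K, in line with the figures above.

```java
/** Back-of-envelope crash physics; all names here are illustrative only. */
final class CrashPhysics {

  /** Kinetic energy in joules: 0.5 * m * v^2, with speed converted from km/h to m/s. */
  public static double kineticEnergyJoules(double massKg, double speedKmh) {
    double metresPerSecond = speedKmh / 3.6;
    return 0.5 * massKg * metresPerSecond * metresPerSecond;
  }

  /** Temperature rise in kelvin if all the energy went into heating the body. */
  public static double temperatureRiseKelvin(double joules, double bodyMassKg,
      double specificHeatJoulesPerGramKelvin) {
    return joules / (bodyMassKg * 1000.0 * specificHeatJoulesPerGramKelvin);
  }

  public static void main(String[] args) {
    // 72 kg rider + 13 kg bike + an assumed 3 kg of baggage, at 20 km/h
    double energy = kineticEnergyJoules(72 + 13 + 3, 20);
    // only the rider (72 kg) gets to warm up; 3.5 J/(g·K) as quoted above
    double rise = temperatureRiseKelvin(energy, 72, 3.5);
    System.out.printf("energy = %.0f J, rise = %.4f K%n", energy, rise);
  }
}
```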

I sit up while some people nearby run over to make sure I'm OK. It was clearly dramatic. I feel a bit bashed but no pain. However the general rule for mountain bike related crashes is.: sit down and wait for the adrenaline wave to pass and then you can actually assess how injured you are.

This time while I wait for that wave to pass I do actually feel under my jacket to see what my chest is like and where I normally have a collar bone I can feel some things moving under my skin. This is not good.

I've never actually broken anything before in my life. I have a far number of dents bruises scars and other bits of damage collected over the years. I haven't been able to run since 2007 on account of tendon damage on my left leg. And the back of my right leg has a set of scars the exact same shape and radius as a chainring. But never breakage. At least here I'm only a couple of miles away from home. I had initially considered just walking back pushing the bike but given the state of those bones that's not gonna happen.

Phone up my wife who is at home waiting for a replacement dishwasher to be delivered. It's not arrived yet and the status update implies it is at least half an hour away. This gives her enough time to drive over collect me and bring me home. The crash's audience- a couple visiting Bristol- stay around with me until then and help load the bicycle. Then it's home and onto accident and emergency.

The drive home is fine except every time we go over a bump it hurts. Anyone who knows Bristol will appreciate it's going to hurt quite a lot on account of the roads are slowly dissolving into a state which archaeologists wouldn't consider up to standard of Neolithic hunting trails. I realise I made a mistake here. We have a first aid kit in the car and i should have put my arm in a sling and i should have put my arm in a sling straightaway. not for the drive but because you need to factor in the time you'll be sitting in the hospital. .

(that section, repetitions included but not the maths, was dictated using Microsoft "something went wrong" . I worked out that even though Google docs and visual studio don't take dictation (why not?) Notepad does. So I can dictate into notepad for a while Then copy and paste It into Google Docs. Just like all the other server side NLP-as-a-service product you can only dictate for a short period of time before it makes a beeping sound and stops then you have to press a key to start it again. That's a really great idea isn't it? To add a product to help people with accessibility issues interact with your computer by having to press windows+ H *simultaneously* on the keyboard every 30 seconds to say "no I still don't have the ability to use a keyboard"? what product manager felt that was acceptable?

but, it suddenly switched to a "let's go back and overwrite everything you've just typed mode" and I could not get out of it. I do not want to bring up window services manager and start killing things just to see if that makes a difference or reboot the system. After all, if it is the servers that are sending the wrong instructions back to the OS it's not going to make any difference whatsoever.

I've switched to an iPad.

iOS speech recognition is another "speech misunderstanding as a service" Product . When I do get round to giving that talk about the mediocrity of online speech recognition services, I will cite Apple products as an example of unconscious class bias in speech recognition based on datasets skewed to your existing customer base.

As long as I put on a measured middle-class southern English accent "smug NW3 postcode mode" it seems to understand what i say. Use my default accent, "Excited NW6" and it gets less reliable. I fear for anyone with a strong Glasgow or Liverpool accent trying to get Siri to do anything.

down to A&E. 10 minutes walk; less bumpy than a drive. For Americans, A&E is a bit like ER only less expensive. you don't even get asked for credit card and ID before they let you sit down.

You also get a broad view of a diverse city with the selection bias that everybody in there has a health issue they consider urgent. I'm sitting one seat away from a Somali woman who alternates between talking on the phone to weeping quietly. She says she doesn't need any help. Nearby a French-speaking father and son await attention; the son's foot does not point the correct way out of a leg. And while I wait the police wheel in someone wearing those white paper oversuits they always showing police dramas when the forensic team are trying to work out how someone died. Trivia: plastic bags they were on their feet have flat soles So they don't leave footprints. Judging by the way the patient's leg is held out they may not deliver much traction. I do wish however that more of the people in the room would wear face masks, and of those that do, it's time they should've learned how they work and that they need to cover the nose. I am glad I had my third booster shot a few weeks ago.

after an hour sitting in the chair with my arm slumped down by my side I am invited in for assessment by a nurse. Before looking at the shoulder she looks for any other injuries discusses whether I banged my head how does my neck feel can I turn it et cetera et cetera all good. What about your shoulder pain she asks on a scale of one to 10? 2 to 3 as long as I don't move it I reply. she touches the collarbone. As tears spontaneously come out of my eyes I say "that's a bit more. "She agrees I'll be needing an x-ray and sending me off to a different waiting room.

One thing I've learned from a couple of other visits to the hospital is that it is good to have a bag with things that are useful. Phone charger for example, something to read and epilepsy medicines. This time I've brought a short sleeved shirt which buttons up at the front. Will be needing that soon. the X-ray waiting area is nearly empty. Signs up around the room are addressed to victims of domestic violence and giving them phone numbers to call for help. That's not a good sign of how some adults end up in that part of the hospital. In the kids x-ray section it's all about why you should be more careful on trampolines… though by that time it's a bit late.

in the x-ray room they help me get my outer jacket off, then they cut off the inner cycling top completely to put it off easily. they did offer to see if we can get it off unscathed but I lack the sentimentality to want to hurt myself quite that much. Then I get to participate in the second major physics event of the day. This time I'm the target of a low-luminosity beam of photons in the 5-10 KeV range while the radiologist runs to a safe distance -as they should.

back in the main waiting room now wearing a gown. Someone comes in screaming about brickwork in their eyeball. They get priority, While the rest of us make a mental note About the value of safety glasses

I sit around another half hour before I'm pulled in for my results. "you have broken your collarbone -but it looks like it will heal without any intervention which is good because there is not much intervention we can do". I put my replacement shirt on and I'm given a sling and some quality prescription painkillers. I go home, have some food and something to drink – I was so thirsty but in case I was going in for surgery I hadn't drunk anything since the crash.

Up into the living room where we pack cushions around me until I'm in a comfortable position. This is pretty much where I'm staying right now; I've been sleeping here too.

apparently the first few days are the most painful, I am trying to move very carefully and not to use that arm at all. Provided I take the painkillers every few hours and don't use the arm it is mostly okay. I would really like to drink a beer but the prescription forbids it. oh well, sometimes you have to prioritise.

now what? Well, I don't have any follow-up visits with the hospital arranged. I stuck my x-ray photo up on the orange riders Facebook group and got feedback from all the other people that have done similar things. It splits between " I did that and I was back on my bike within six weeks" and "look at the pins they've stuck in my bone after things didn't get any better after eight weeks. They go beep through an airport metal detector and it hurts in the cold". currently I'm hoping for the heal on their own outcome.

The fact that I can't type and that speech recognition across all the various platforms is so awful it's going to complicate my life for the next couple of weeks. I won't be coding, I can do some code reviews and I've been collaborating with a colleague on a big backport exercise which we can do together over zoom.

(update: iOS dictation suddenly went into this weird mode there where it kept re-typing and then going back and re-typing the same sentence again and again until I tapped the stop dictating button. One good feature of Amazon Alexa as you can look at the voice history on the application and mark up which ones were in fact completely wrong. I don't see any mechanism of doing that with any of the other tools – and without that there is no way for the system to actually train on what is the usability experience of individuals. Yes you may be able to rely on deep aggregate data of all your users, but without the feedback loop to say "this sentence is wrong", I don't see how you can actually improve the experience -or even assess how well your product is actually working.)

To close then: it's a pretty painful end to 2021 but I'm looking forward to 2022.

(Meanwhile I have to try and get any of these speech recognition disasters to work. Apple Mac off-line dictationOn a laptop with sticky keys enabledIs my goal.This is the one I'm using right nowThe one which isn't putting spaces between sentencesAnd assumes any pause for more than a few seconds constitutesAn end of the sentence.Like I saidNo product manager of any these products should be proud of what they've achievedIn terms of accessibility.)

2021-02-27

offline

I've been off this blog for a while. Some RSI problems, amongst other reasons. I'll try and dictate things

2018-12-20

Isolation is not participation

First

I speak only for myself as an individual, not representing any current or previous employer, ASF ...etc.
I'm not going anywhere near business aspects; out of scope.
I am only looking at Apache Hadoop, which is of course the foundation of Amazon EMR. Which also means: if someone says "yes but projects X, Y & Z, ..." my response is "that's nice, but coming back to the project under discussion, ..."
Lots of people I know & respect work for Amazon. I am very much looking at the actions of the company, not the individuals.
And I'm not making any suggestions about what people should do, only arguing that the current stance is harmful to everyone.
I saw last week that EMR now has a reimplementation of the S3A committers, without any credit whatsoever for something I consider profound. This means I'm probably a bit sensitive right now. I waited a couple of days before finishing this post.

With that out the way:-

As I type this a nearby terminal window runs MiniMR jobs against a csv.gz file listing AWS landsat photos, stored somewhere in a bucket.

The tests run on a macbook, a distant descendant of BSD Unix, Mach Kernel and its incomplete open source sibling, Darwin. Much of the dev tooling I use is open source, downloaded via homebrew. The network connection is via a router running DD-WRT.

That Landsat file, s3a://landsat-pds/scene_list.gz, is arguably the single main contribution from the AWS infra for that Hadoop testing.

It's a few hundred MB of free to use data, so I've used it for IO seek/read performance tests, spark dataframe queries, and now, SQL statements direct to the storage infrastructure. Those tests are also where I get to explore the new features of the java language, LambdaTestUtils, which is my lifting of what I consider to be the best bits of scalatest. Now I'm adding async IO operations to the Hadoop FileSystem/FileContext classes, and in the tests I'm learning about the java 8+ completable future stuff, how to get them to run IOException-raising code (TL;DR: it hurts)
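To give a flavour of why it hurts, here is a minimal sketch of one way to run IOException-raising code in a CompletableFuture and get the IOException back out at the far end. It is an illustration only, not the Hadoop async IO code: the AsyncIO class, the IOCallable interface and the submit()/await() helpers are all invented for this example. Only CompletableFuture, CompletionException and UncheckedIOException are real JDK classes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Illustrative helpers only; none of these names exist in Hadoop. */
final class AsyncIO {

  /** Hypothetical equivalent of Supplier which is allowed to raise an IOException. */
  @FunctionalInterface
  public interface IOCallable<T> {
    T call() throws IOException;
  }

  /** Run the operation asynchronously, smuggling any IOException out as an unchecked one. */
  public static <T> CompletableFuture<T> submit(IOCallable<T> operation) {
    return CompletableFuture.supplyAsync(() -> {
      try {
        return operation.call();
      } catch (IOException e) {
        // Supplier.get() declares no checked exceptions, so wrap it
        throw new UncheckedIOException(e);
      }
    });
  }

  /** Block for the result, unwrapping the layers to rethrow the original IOException. */
  public static <T> T await(CompletableFuture<T> future) throws IOException {
    try {
      return future.join();
    } catch (CompletionException e) {
      Throwable cause = e.getCause();
      if (cause instanceof UncheckedIOException) {
        throw ((UncheckedIOException) cause).getCause();
      }
      // anything else is passed up unchanged
      throw e;
    }
  }
}
```

Two layers of wrapping and unwrapping just to get a FileNotFoundException back to a caller which was expecting one all along.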

While I wait for my tests to finish, I see there's a lot of online discussion about cloud providers and open source projects, especially post AWS re:Invent (re:Package?), so I thought I'd join in.

Of all the bits of recent writing on the topic, one I really like is Roman's, which focuses a lot on community over code.

That is a key thing: open source development is a community. And members of that community can participate by

writing code
reviewing code
writing docs
reviewing docs
testing releases, nightly builds
helping other people with their problems
helping other projects who are having problems with your project's code.
helping other projects take ideas from your code and make use of it
filing bug reports
reviewing, commenting on, maybe even fixing bug reports.
turning up at conferences, talking about what you are doing, sharing
listening. Not just to people who pay you money, but people who want to use the stuff you've written.
helping build that community by encouraging the participation of others, nurturing their contributions along, trying to get them to embrace your code and testing philosophy, etc.

There are more, but those are some of the key ones.

A key, recurrent theme is that community, where you can contribute in many ways, but you do have to be proactive to build that community. And the best ASF projects are ones which have a broad set of contributors.

Take for example, the grand Java 9, 10, 11 project: [HADOOP-11123, HADOOP-11423, HADOOP-15338 ]. That's an epic of suffering, primarily taken on by Akira Ajisaka, and Takanobu Asanuma at Yahoo! Japan, and a few other brave people. This isn't some grand "shout about this at keynotes" kind of feature, but it's a critical contribution by people who rely on Hadoop and have a pressing need "Run on Java 9+", and are prepared to put in the effort. I watch their JIRAs with awe.

That's a community effort, driven by users with needs.

Another interesting bit of work: Multipart Upload from HDFS to other block stores. I've been the specification police there; my reaction to "adding fundamental APIs without strict specification and tests" was predictable, so I don't know why they tried to get it past me. Who did that work? Ewan Higgs at Western Digital did a lot -they can see real benefit for their enterprise object store. So did Virajith Jalaparti at Microsoft: people who want HDFS to be able to use their storage systems for the block store. And there's another side-effect: that multipart upload API essentially provides a standard API for multipart-upload based committers. For the S3A committers we added our own private API to S3A FS "WriteOperationHelper"; this will give it to every FS which supports it. And you can do a version of DistCP which writes blocks to the store in parallel from across the filesystem...a very high performance blockstore-to-block-or-object store copy mechanism.

This work is all being done by people who sell storage in one form or another, who see value in the feature, and are putting in the effort to develop it in the open, encourage participation from others, and deliver something independent of underlying storage.

This bit of work highlights something else: that Hadoop FS API is broadly used way beyond HDFS, and we need to evolve it to deal with things HDFS offers but keeps hidden, but also for object stores, whose characteristics involve:

very expensive cost of seek(), especially given ORC and Parquet know their read plan way beyond the next read. Fix: HADOOP-11867: Vectorized Read Operations,
very expensive cost of mimicking hierarchical directory trees, treewalking is not ideal,
slow to open a file, as even the existence check can take hundreds of milliseconds.
Interesting new failure modes. Throttling/503 responses if you put too much load on a shard of the store, for example. Which can surface anywhere.
Variable rename performance O(data) for mimicked S3 rename, O(files) for GCS, O(1) with some occasional pauses on Azure.

There are big challenges here and it goes all the way through the system. There's no point adding a feature in a store if the APIs used to access it don't pick it up; there's no point changing an API if the applications people use don't adopt it.

Which is why input from the people who spent time building object stores and hooking up their applications is so important. That includes:

Code
Bug reports
Insight from their experiences
Information about the problems they've had
Problems their users are having

And that's also why the decision of the EMR team to isolate themselves from the OSS development holds us back.

We end up duplicating effort, like S3Guard, which is the ASF equivalent of EMR consistent view. The fact that EMR shipped with a feature long before us could be viewed as their benefit of having a proprietary S3 connector. But S3Guard, like the EMR team, models its work on Netflix S3mper. That's code which one of the EMR team's largest customers wrote, code which Netflix had to retrofit onto the EMR closed-source filesystem using AspectJ.

And that time-to-market edge? Well, it is not so obvious any more.

The EMR S3-optimised committer shipped in November 2018.
Its open source predecessor, the S3A committers, shipped in Hadoop 3.1, March 2018. That's around eight months ahead of the EMR implementation.
And it shipped in HDP-3.0 in June 2018. That's 5 months ahead.

I'm really happy with that committer, first bit of CS-hard code I've done for a long time, got me into the depths of how committers really work, got an unpublished paper "A zero Rename Committer" from it. And, in writing the TLA+ spec of the object store I used on the way to convincing myself things worked, I was corrected by Lamport himself.

Much of that commit code was written by myself, but it depended utterly on some insights from Thomas Demoor of WDC. It also contains a large donation of code from Netflix, their S3 committer. They'd bypassed emrfs completely and were using the S3A client direct: we took this, hooked it up to what we'd already started to write, incorporated their mockito tests, and now their code is the specific committer I recommend. Because Netflix were using it and it worked.

A heavy user of AWS S3 wanting to fix their problem, sharing the code, having it pulled into the core code so that anyone using the OSS releases gets the benefit of their code *and their testing*.

We were able to pick that code up because Netflix had gone around emrfs and were writing things themselves. That is: they had to bypass the EMR team. And once they'd done that, we could take it, integrate it and ship it eight months before that EMR team did. With proofs of correctness(-ish).

Which is why I don't believe that isolationism is good for whoever opts out of the community. Nor, by the evidence, is it good for their customers.

I don't even think it helps the EMR team's colleagues with their own support calls. Because really, if you aren't active in the project, those colleagues end up depending on the goodwill of others.

(photo: building in central Havana. There'll be a starbucks there eventually)

2018-10-05

Java's use of Checked Exceptions cripples lambda-expressions

I like lambda-expressions. They have an elegance to them which, when I put into my code along with comments using the term "iff", probably marks me out as a Computer Scientist; the way people who studied Latin drop random phrases into sentences to communicate more precisely with others who did the same. Here, rather than use phrases like "sui generis", I can drop in obscure references to Church's work, allude to "The Halting Problem" and say "tuple" whenever "pair" wouldn't be elitist enough.

Really though, lambda-expressions are nice because they are a way to pass around snippets of code to use elsewhere.

I've mostly used this in tests, with LambdaTestUtils.intercept() being the code we've built up to use them, something clearly based on ScalaTest's work of the same name.

protected void verifyRestrictedPermissions(final S3AFileSystem delegatedFS)
    throws Exception {
  intercept(AccessDeniedException.class, 
      () -> readLandsat(delegatedFS));
}
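For anyone who hasn't met the pattern, here is a minimal sketch of what an intercept()-style helper can look like. It is a simplified illustration, not the actual LambdaTestUtils code: the InterceptSketch class name is invented, and it takes a plain java.util.concurrent.Callable so that the closure is free to raise any checked exception.

```java
import java.util.concurrent.Callable;

/** Simplified illustration of an intercept() helper; not the Hadoop implementation. */
final class InterceptSketch {

  /**
   * Evaluate the closure and return the exception it raised, if it was of the expected class.
   * Any other exception is rethrown; a normal return is treated as a test failure.
   */
  public static <T, E extends Exception> E intercept(Class<E> clazz, Callable<T> eval)
      throws Exception {
    try {
      eval.call();
    } catch (Exception e) {
      if (clazz.isInstance(e)) {
        // the expected failure: hand it back so the test can make further assertions
        return clazz.cast(e);
      }
      // not what was expected: let the test fail with the real stack trace
      throw e;
    }
    throw new AssertionError("Expected " + clazz.getName()
        + " but the call returned normally");
  }
}
```

Because Callable.call() is declared as throwing Exception, the lambda passed in can invoke IOException-raising code directly, which is exactly what the test above relies on.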

I'm also working on wiring up the UserGroupInformation.doAs() call to lambda-expressions, so we don't have to faff around creating over-complex PrivilegedAction subclasses, instead go bobUser.do(() -> fs.getUsername()). I've not done that yet, but have the stuff in my tests to explore it: doAs(bobUser, () -> fs.getUsername()).

Java-8 has embraced this, with its streams API, Optional class, etc. I should be able to do the same elegant code in Java 8 that you can do in Scala, such as on an Optional<UserGroupInformation> instance: no more need to worry about null pointers!

Optional<Credentials> maybeCreds = maybeBobUser.map(b -> b.getCredentials());

And I can do the same on those credentials

List<TokenIdentifier> ids = maybeCreds.map(::getAllTokens).stream()
    .map(::decodeTokenIdentifier)
    .getOrElse(new LinkedList<>()).stream()

Except, well, I can't. Because of checked exceptions. That Token::decodeTokenIdentifier method can raise IOException instances whenever there's a problem decoding the byte array which contains the token identifier (it can also return null for other issues; see HADOOP-15808).

All Hadoop API calls which do some kind of network or IO operation declare they throw an IOException when things fail. It's consistent, it works fairly well. Sometimes, in interactions with underlying libraries (AWS SDK, Azure SDK), we catch & map their exceptions, but we also do other error translation there too, then feed that into retry logic and things even out. When you call getFileStatus() against s3a:// or abfs:// you can be confident that if it's not there you'll get a FileNotFoundException; if there was some connectivity issue our code will have retried, provided it wasn't something unrecoverable like a DNS/Routing problem, where you'll get a NoRouteToHostException in your stack traces.

Checked exceptions are everywhere in the Hadoop code.

And the Java Streams API can't work with that. None of the operations on a stream declare that they raise exceptions, so none of the lambda-expressions you can call on them may either. I could jump through hoops and catch & convert them into some RuntimeException, but then what? All the code which is calling mine expects failures to come as IOExceptions, expects those FileNotFoundExceptions, etc. We cannot make serious use of the new APIs in our codebase.
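For a sense of what that hoop-jumping looks like, here is a minimal sketch using the JDK's own java.io.UncheckedIOException. The class TunnelSketch and its decode() method are made-up stand-ins for an IO-bound call, not Hadoop code. The checked exception is wrapped inside the lambda and unwrapped again around the whole stream pipeline, so callers still see an IOException:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical example: tunnelling an IOException through a stream. */
public final class TunnelSketch {

  private TunnelSketch() {
  }

  /** Made-up stand-in for an IO-bound call which can fail. */
  static String decode(String raw) throws IOException {
    if (raw.isEmpty()) {
      throw new IOException("cannot decode an empty identifier");
    }
    return raw.toUpperCase();
  }

  /** Decode every entry, still failing with the original IOException. */
  public static List<String> decodeAll(List<String> raw) throws IOException {
    try {
      return raw.stream()
          .map(s -> {
            try {
              return decode(s);
            } catch (IOException e) {
              // map() cannot throw a checked exception, so wrap it
              throw new UncheckedIOException(e);
            }
          })
          .collect(Collectors.toList());
    } catch (UncheckedIOException e) {
      // unwrap, so callers get the IOException they expect
      throw e.getCause();
    }
  }
}
```

It works, and the original exception (subclass included) survives the round trip, but every pipeline needs the same wrap-and-unwrap boilerplate, which is much of the elegance gone.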

Now, the Oracle team could just have declared that the new map() method raised Exception or similar, but then it'd have been unusable in those methods which don't declare that they throw exceptions, or those which say, throw IOExceptions.

There's no obvious solution to this with those standard Java classes, leaving me the options of (a) not using them or (b) writing my own, which is something I've been doing in places. I shouldn't have to do that; all it does is create maintenance pain, and it doesn't glue together with those standard libraries.
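Option (b) tends to look something like the following sketch. The names RaisingSketch, IOFunction and mapIOE are hypothetical, chosen for illustration only: a functional interface whose method declares IOException, plus a helper which applies it to an Optional.

```java
import java.io.IOException;
import java.util.Optional;

/** Hypothetical example of a home-made, IOException-aware map(). */
public final class RaisingSketch {

  private RaisingSketch() {
  }

  /** Like java.util.function.Function, but allowed to raise an IOException. */
  @FunctionalInterface
  public interface IOFunction<T, R> {
    R apply(T t) throws IOException;
  }

  /**
   * Equivalent of Optional.map() for functions which may raise
   * an IOException; the exception propagates to the caller.
   */
  public static <T, R> Optional<R> mapIOE(
      Optional<T> source,
      IOFunction<? super T, ? extends R> fn) throws IOException {
    if (!source.isPresent()) {
      return Optional.empty();
    }
    R result = fn.apply(source.get());
    // as with Optional.map(), a null result becomes an empty Optional
    return Optional.ofNullable(result);
  }
}
```

A call such as mapIOE(maybeFile, f -> readHeader(f)), with readHeader() being some method declared as throwing IOException, then compiles. The cost is that IOFunction is not a java.util.function.Function, so none of the standard stream or Optional methods will accept it.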

I don't have a choice. And neither does anyone else using Java. Scala doesn't have this problem as exceptions aren't checked. Groovy doesn't have this problem as exceptions aren't checked. C# doesn't have this problem as exceptions aren't checked. Java, however, is now trapped by some design decisions made twenty+ years ago which seemed a good idea at the time.

Is there anything Oracle can do now? I don't know. You could change the compiler to say "all exceptions are unchecked" and see what happens. I suspect a lot of code will break. And because it'll be on the failure paths where problems surface, it'd be hard to get that test coverage to be sure that failures are handled properly. Even so, I can imagine that happening, otherwise, even as the language tries to "stay modern", it's crippled.

2018-04-02

Computer Architecture and Software Security

There's a new paper covering another speculative execution-based attack on system secrets, BranchScope.

This one relies on the fact that for branch prediction to be effective, two bits are generally allocated to each branch, giving four states: strongly & weakly taken and strongly & weakly not taken. The prediction state of a branch is based on the value in BranchHistoryTable[hash(address)] and used to choose the speculation; if it was wrong it is moved from strongly -> weakly, and from weakly to the opposite direction. Similarly, in weakly taken/not taken, if the prediction was correct, then it moves to strong.

Why so complex? Because we loop all the time

for (int i = 0; i < 1000; i++) {
  doSomething(i);
}

Which probably gets translated into some assembly code (random CPU language I just made up)

    MOV  r1, 0
L1: CMP r1, 999
    JGT end
    JSR DoSomething
    ADD r1, 1
    JMP  L1
end: ... continue

For 1000 times in that loop, the branch is taken, then once, at the end of the loop, it's not taken. The first time it's encountered, the CPU won't know what to do, it will just guess one of them and have a 50% chance of being wrong (see below). After that first iteration though it'll guess right, until the final test fails and the loop is exited. If that loop is itself called repeatedly, the fact that final iteration was mispredicted shouldn't lose the fact that the rest of the loop was predicted repeatedly. Hence, two bits.
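A toy simulation makes the difference visible. This is only a sketch of a textbook two-bit saturating counter, not a model of any real CPU; the class name and the initial "weakly not taken" guess are assumptions. It runs the loop branch (taken N times, then not taken once) several times over and counts mispredictions, for both a two-bit and a one-bit predictor.

```java
/** Toy model of a two-bit saturating branch predictor. */
public final class TwoBitPredictor {

  // 0 = strongly not taken, 1 = weakly not taken,
  // 2 = weakly taken,       3 = strongly taken
  private int state;

  public TwoBitPredictor(int initialState) {
    this.state = initialState;
  }

  /** Predict "taken" in either of the two taken states. */
  public boolean predict() {
    return state >= 2;
  }

  /** Move one step towards the actual outcome, saturating at 0 and 3. */
  public void update(boolean taken) {
    state = taken ? Math.min(3, state + 1) : Math.max(0, state - 1);
  }

  /**
   * Run a loop branch which is taken `iterations` times and then
   * not taken once, `runs` times over; count the mispredictions.
   */
  public static int twoBitMisses(int runs, int iterations) {
    TwoBitPredictor p = new TwoBitPredictor(1); // assumed initial guess
    int missed = 0;
    for (int r = 0; r < runs; r++) {
      for (int i = 0; i <= iterations; i++) {
        boolean taken = i < iterations;
        if (p.predict() != taken) {
          missed++;
        }
        p.update(taken);
      }
    }
    return missed;
  }

  /** The same workload with one bit: just remember the last outcome. */
  public static int oneBitMisses(int runs, int iterations) {
    boolean last = false; // assumed initial guess
    int missed = 0;
    for (int r = 0; r < runs; r++) {
      for (int i = 0; i <= iterations; i++) {
        boolean taken = i < iterations;
        if (last != taken) {
          missed++;
        }
        last = taken;
      }
    }
    return missed;
  }
}
```

For three runs of a 1000-iteration loop, twoBitMisses(3, 1000) returns 4 while oneBitMisses(3, 1000) returns 6: the one-bit predictor gets both the exit and the next entry wrong, whereas the two-bit one only drops to "weakly taken" on exit and so still guesses right on re-entry.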

As Hennessy and Patterson write in Computer Architecture: A Quantitative Approach (v4, p89), "the importance of branch prediction has increased". With deeper pipelines and the mismatch of CPU speed and memory, guessing right matters.

There isn't enough space in the Branch History Table to store 2 bits of history for every single branch in a system, so instead there'll be some smaller table and some function to take the full address and map it to an offset in that table. According to [Pan92], 4096 to 8192 entries is not that far off "an infinite" table. All that's left is the transform from program counter to BHT entry, which for 32-bit aligned opcodes could be something as simple as (PC >> 2) & 8191.
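As a sketch, such a mapping could look like this. The table size comes from the range quoted above; the shift of two bits is an assumption which follows from 4-byte aligned opcodes, whose two lowest address bits are always zero. Real predictors use their own, often undocumented, hash functions.

```java
/** Hypothetical mapping from program counter to branch history table slot. */
public final class BhtIndex {

  /** Table size; a power of two, so that masking selects an entry. */
  static final int ENTRIES = 8192;

  /** Assumption: 4-byte aligned opcodes, so the two low bits carry no information. */
  static final int ALIGN_SHIFT = 2;

  private BhtIndex() {
  }

  /** Drop the alignment bits, then keep the low 13 bits. */
  public static int index(long pc) {
    return (int) ((pc >>> ALIGN_SHIFT) & (ENTRIES - 1));
  }
}
```

With these numbers, any two branches whose addresses differ by a multiple of 32768 bytes (8192 entries of 4 bytes each) land in the same slot.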

But the table is not infinite, there will be clashes: if something else is using the same entry in the BHT, then your branch may be predicted according to its history.

The new attack then simply works out the taken/not taken state of the target branch by seeing how your own code, whose addresses are designed to conflict, is predicted. That's all. And given that ability to observe branch direction, it can be used to reach conclusions about the state of the system.
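The idea can be shown with a toy model. This is a simulation sketch only, not the procedure from the paper: the class BranchScopeToy, the table size and the indexing function are all assumptions, and the simulation reads the hit/miss result directly where real code would have to infer it from timing or performance counters. The spy primes the shared two-bit counter to a known state, lets the victim execute its branch once, then probes with a branch of its own.

```java
/** Toy simulation of inferring a branch direction via a shared predictor entry. */
public final class BranchScopeToy {

  // one 2-bit counter per entry: 0/1 = not taken, 2/3 = taken
  private final int[] table = new int[8192];

  /** Assumed indexing: 4-byte aligned opcodes, 8192 entries. */
  public static int index(long pc) {
    return (int) ((pc >>> 2) & 8191);
  }

  /** Execute a branch at pc; return true iff it was predicted correctly. */
  public boolean execute(long pc, boolean taken) {
    int i = index(pc);
    boolean predicted = table[i] >= 2;
    table[i] = taken ? Math.min(3, table[i] + 1) : Math.max(0, table[i] - 1);
    return predicted == taken;
  }

  /**
   * Prime the shared entry, run the victim, then probe.
   * Returns the inferred direction of the victim's branch.
   */
  public static boolean spy(BranchScopeToy cpu, long spyPc, Runnable victim) {
    // prime: three not-taken branches force "strongly not taken",
    // one taken branch then leaves the counter at "weakly not taken"
    for (int n = 0; n < 3; n++) {
      cpu.execute(spyPc, false);
    }
    cpu.execute(spyPc, true);

    // the victim's branch nudges the shared counter one way or the other
    victim.run();

    // probe with a taken branch: it is only predicted correctly
    // if the victim pushed the counter over to the "taken" side
    return cpu.execute(spyPc, true);
  }
}
```

Given a victim branch at address v and a spy branch at v + 32768, which collide under the assumed indexing, spy() returns true exactly when the victim's branch was taken.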

Along with caching, branch prediction is the key way in which modern CPUs speed things up. And it does. But it's the clash between your entries in the cache and BHT and that of the target routine which is leaking information: how long it takes to read things, whether a branch is predicted or not. The very act of speeding up code is what leaks secrets.

"Modern" CPU Microarchitecture is in trouble here. We've put decades of work into caching, speculation, branch prediction, and now they all turn out to expose information. We built for speed, at what turns out to be the cost of secrecy. And in cloud environments where you cannot stop malicious code running on the same CPU, that means your secrets are not safe.

What can we do?

Maybe another microcode patch is possible: when switching from usermode to OS mode then the BHT is flushed. But that will cripple performance in any loop which invokes system code in it. Or you somehow isolate BHT entries for different virtual memory spaces. Probably the best long term, but I'll leave it to others to work out how to implement.

What's worrying is the fact that new exploits are appearing so soon after Meltdown and Spectre. Security experts are now looking at all of the speculative execution bits of modern CPUs and thinking "that's interesting..."; more exploits are inevitable. And again, systems, especially cloud infrastructures, will be left struggling to catch up.

Cloud infrastructures are probably going to have to pin every VM to a dedicated CPU, with the hypervisor on its own part. That will limit secret exfiltration to the VM OS and anything else running on the core (the paper looks at the intel SGX "secure" zone and showed how it can be targeted). It'll be the smaller VMs at risk here, and potentially containerized stuff: you'd want all containers on a single core to be "yours".

What about single-core systems running a mix of trusted and untrusted code (your phone, your web browser)? That's going to be hard. You can't dedicate one x86 core per browser tab.

Longer term: we're going to have to go through every bit of modern CPU architecture from a security perspective and say "is this safe?" And no doubt conclude, any speedup mechanism which relies on the history of previous work is insecure, if that history includes the actions taken (or speculatively taken) by sensitive applications.

Which is bad news for the majority of today's high end CPUs, especially those ones trying to keep the x86 instruction set alive. Those are the parts which have had so much effort invested into getting fractional improvements in caching, branch prediction, speculation and pipeline efficiency, and so have gotten incredibly complex. That's where the big vulnerabilities live.

This may push us back towards "underperformant but highly parallel" massively multicore systems. Little/no speculation, isolating user space code into their own processes.

The most recent example of this is/was the Sun Niagara CPU line, which started off with a pool of early-90s era SPARC CPUs without fancy branch prediction...instead they had 4 sets of state to cover the entire execution state of four different threads, scheduling work between them. Memory access? Stall that thread, schedule another. Branch? Don't predict, just wait and see, and add other thread opcodes to the pipeline.

There are still going to be security issues there (cache shared across the many cores, the actions of one thread can be implicitly observed by others in their execution times). And it seemingly does speculate memory loads if there was no other work to schedule.

What's equally interesting is that the system is so power efficient. Speculative execution and branch prediction (a) require lots of gates, renamed registers, branch history tables and the like, and (b) waste energy on every missed prediction or branch. Compare that to an Itanium part, where you almost need to phone up your electricity supplier for permission to power one up.

The Niagara 2 part pushed it ahead further to a level that is impressive to read. At the same time, you can see a great engineering team struggling with a fab process behind what Intel could do, Sun trying to fight the x86 line, and, well, losing.

Where are the parts now? Oracle's M8 CPU PDF touts its Out Of Order execution (that is, speculative execution) and data/instruction prefetch. I fear it's now got the same weaknesses as everything else. Apparently the Java 8 streams API gets bonus speedup, which reminds me to post something criticising Java checked exceptions for making that API unusable for the throws IOException Hadoop codebase. As for the virtualization support, again, you'd need to think about pinning to a CPU. There's also that $L1-$L3 cache hit/miss problem: something speculating in one CPU could evict cached data observable to others, unless speculative memory fetches weren't a feature of the part.

They look nice-but-pricey servers; if you are paying the Oracle RDBMS tax the all-in-one price might mitigate that. Overall though, a focus on many fast-but-dim parts, over the "throw Si at maximum single thread" architecture of recent x86 designs, may provide opportunities for future designs to be more resistant to attacks related to speculative execution. I also think I'd like to see their performance numbers running Apache Spark 2.3 with one executor per thread and lots of RAM.

Update April 3 2018: I see within hours of this posting rumours start that Apple is looking at ARM parts for macbooks in 2020+. Not a coincidence! Actually it is, but because the ARM parts are simpler, they may be less exposed to specex-based attacks, even though Meltdown did affect those implementations which did speculative memory fetches. I think the Niagara architecture has more potential, but it probably works best in massively-multithreaded server side systems, not laptops where latency is the performance metric, not throughput.

[Photo: my 2008 Fizik Gobi saddle snapped one of its Titanium rails last week. Made it home in the rain, but a sign that after a decade, parts just wear out.]

2018-01-29

Advanced Deanonymization through Strava

Strava is getting some bad press, because its heatmap can be used to infer the existence and floorplan of various US/UK military and govt sites.

I covered this briefly in my Berlin Buzzwords 2016 Household INFOSEC talk, though not in that much detail about what's leaked, or what a Garmin GPS tracker is vulnerable to (Not: classic XML entity/XInclude attacks, but a malicious site could serve up a subverted GPS map that told me the local motorway was safe to cycle on).

Here are some things Strava may reveal

Whether you run, swim, ski or cycle.
If you tell it, what bicycles you have.
Who you go out on a run or ride with
When you are away from your house
Where you commute to, and when
Your fitness, and whether it is getting better or worse.
When you travel, what TZ, etc.

How to lock down your account?

I only try to defend against drive-by attacks, not nation states or indeed, anyone competent who knows who I am. For Strava then, my goal is: do not share information about where my bicycles are kept, nor those of my friends. I also like to not share too much about the bikes themselves. This all comes secondary to making sure that nobody follows me back from a ride over the Clifton Suspension Bridge (standard attack: wait at the suspension bridge, cycle after them. Standard defence: go through the clifton backroads, see if you are being followed). And I try to make sure all our bikes are locked up, doors locked etc. The last time one got stolen was due to a process failure there (unlocked door) and so the defences fell to some random drug addict rather than anyone using Strava. There's a moral there, but it's still good to lock down your data against tomorrow's Strava attacks, not just today's. My goal: keep my data secure enough to be safe from myself.

I don't use my real name. You can use a single letter as a surname, an "!", or an emoji.
And I've made sure that none of the people I ride with regularly do so either.
I have a private area around my house, and those of friends.
All my bikes have boring names "The Commuter", not something declaring value.
I have managed fairly successfully to stay off the KoM charts, apart from this climb which I regret doing on so many levels.

For a long time I didn't actually turn the bike computer on until I'd got to the end of the road. I've got complacent there. Even though Strava strips the traces from the private zone when publishing, it does appear to declare the ride distance as the full distance. Given enough rides of mine, you can infer the radius of that privacy zone (fix? Have two overlapping circles?), and the distance on road from the cutoff points to where my house is (overlapping circles won't fix that). You'd need to strip out the start/stop points before uploading to strava (hard) or go back to only switching on recording once you were a randomish distance from your setoff point.

I haven't opted out of the Strava Heatmap, as I don't go anywhere that people care about. That said, there's always concerns in our local MTB groups that Strava leaks the non-official trails to those people who view stopping MTB riders as their duty. A source of controversy.

Now, how would I go for someone else's strava secrets?

You can assume that anyone who scores high in a mountain bike trail is going to have an MTB worth stealing, same for long road climbs.

Ride IDs appear sequential, so you could harvest a day's worth and search.
Join the same cycling club as my targets, see if they publish rides. Fixes: don't join open clubs, ever, and verify members of closed clubs.
Strava KoM chart leakage. Even if you make your rides private, if you get on top 10 riders for that day or whatever, you become visible.

The fact that you can infer nation-state secrets is an interesting escalation. Currently it's the heatmap which is getting the bad press, which is part of the dataset that Strava offer commercially to councils. FWIW, the selection bias on Strava data (male roadies or mountain bikers) means that it's not that good. If someone bought our local data, they'd infer that muddy wood trails with trees and rocks are what the city needs. Which is true, but it doesn't address the lack of any safe way to cross the city.

What is interesting about the heat map, and not picked up on yet, is that you can potentially deanonymize people from it.

First, find somewhere sensitive, like say, The UK Faslane Nuclear Submarine Base. Observe the hot areas, like how people run in rectangles in the middle.

Now, go to MapMyRide and log in. Then head over to create a new route using the satellite data

Download the GPX file. This contains the Lat/Long values of the route

If you try to upload it to strava, it'll reject it as there's no timing data. So add it, using some from any real GPX trace as a reference point. Doesn't have to be valid time, just make it slow enough that Strava doesn't think you are driving and tell you off for cheating.

Upload the file as a run, creating a new activity

The next step is the devious one. "Create a segment", and turn part/all of the route into a Strava segment.

Once strava has gone through its records, you'll be able to see the overall top 10 runners per gender/age group, when they ran, and who they ran with. And, if their profile isn't locked down enough: which other military bases they've been for runs on.

I have no idea who has done this run; whether there'll be any segment matches at all. If not, maybe the trace isn't close enough to the real world traces, everyone runs round clockwise, or, hopefully, people are smart enough to mark the area as private. I'll leave strava up overnight to see what it shows, then delete the segment and run.

Finally, Berlin Buzzwords CFP is open, still offering to help with draft proposals. We can now say it's the place where Strava-based infosec issues were covered 2 years before it was more widely known.

Update 2018-01-29T21:13. I've removed the segment.

Some people's names were appearing there, showing that, yes, you can bootstrap from a heatmap to identification of individual people who have run the same route.

There's no need to blame the people here, so I've pulled the segment to preserve their anonymity. But as anyone else can do it, they should still mark all govt. locations where they train as private areas, so getting excluded from the heatmap and strava segments.

I don't know what Strava will do long term, but to stop it reoccurring, they'll need to have a way to mark an area as "private area for all users". Doable. Then go to various governments and say "Give us a list of secret sites you don't want us to cover". Which, unless the governments include random areas like mountain ranges in mid wales, is an interesting list of its own.

Update 2018-01-30T16:01 to clarify on marking routes private

All ride/runs marked as "private" don't appear in the leader boards
All ride/runs marked as "don't show in leader boards" don't appear
Nor do any activities in a privacy zone make it onto a segment which starts/ends there
But: "enhanced privacy mode" activities do. That is: even if you can't see an individual's activities off their profile, you can see the rides off the leaderboard.

Update 2018-01-31T00:30 Hacker News coverage

I have made Hacker news. Achievement Unlocked!

Apparently

This is neither advanced nor denanonymization (sic).

They basically pluck an interesting route from the hotmap (as per other people's recent discovery), pretend that they have also run/biked this route and Strava will show them names of others who run/biked the same way. That's clever, but that's not "advanced" by any means.

It's also not a deanonymization as there's really no option in Strava for public _anonymous_ sharing to begin with.

1. Thanks for pointing out the typo. Fixed.

2. It's bootstrapping from nominally anon heatmap data to identifying the participants of the route. And unless people use nicknames (only 2 of the 16 in the segment above did), then you reveal your real name. And as it shows the entire route when you click through the timestamp, you get to see where they started/finished, who if anyone they went with, etc, etc. You may not know their real name, but you know a lot about them.

3. "It's not advanced". Actually, what Strava do behind the scenes is pretty advanced :). They determine which of all recorded routes match that segment, within 24h. Presumably they have a record of the bounds of every ride, so first select all rides whose bounds completely enclose the segment. Then they have to go through all of them to see if there is a fraction of their trail which matches. I presume you'd go with the start co-ord and scan the trace to see if any of the waypoints *or inferred bits of the line between two recorded waypoints* is in the range of that start marker. If so, carry on along the trace looking for the next waypoint of the segment; giving up if the distance travelled is >> the expected distance. And they do that for all recorded events in past history.
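To give a feel for the scan I'm guessing at, here is a rough sketch. It is purely illustrative and certainly not Strava's code: the names are made up, the tolerance is an arbitrary parameter, and the bounding-box preselection, the interpolation between recorded waypoints and the "give up when you've gone too far" check are all left out. It walks along a trace looking for each segment waypoint in turn, using the haversine formula for distance.

```java
import java.util.List;

/** Hypothetical sketch of matching a GPS trace against a segment. */
public final class SegmentMatchSketch {

  /** A latitude/longitude pair, in degrees. */
  public static final class Point {
    final double lat;
    final double lon;

    public Point(double lat, double lon) {
      this.lat = lat;
      this.lon = lon;
    }
  }

  static final double EARTH_RADIUS_M = 6_371_000.0;

  private SegmentMatchSketch() {
  }

  /** Great-circle distance in metres, using the haversine formula. */
  public static double distance(Point a, Point b) {
    double dLat = Math.toRadians(b.lat - a.lat);
    double dLon = Math.toRadians(b.lon - a.lon);
    double sinLat = Math.sin(dLat / 2);
    double sinLon = Math.sin(dLon / 2);
    double h = sinLat * sinLat
        + Math.cos(Math.toRadians(a.lat)) * Math.cos(Math.toRadians(b.lat))
        * sinLon * sinLon;
    return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
  }

  /**
   * True iff the trace passes within toleranceM metres of every
   * segment waypoint, in the order the segment lists them.
   */
  public static boolean matches(List<Point> trace, List<Point> segment,
      double toleranceM) {
    if (segment.isEmpty()) {
      return false;
    }
    int next = 0; // index of the segment waypoint being looked for
    for (Point p : trace) {
      // one trace point may be close enough to several waypoints
      while (next < segment.size()
          && distance(p, segment.get(next)) <= toleranceM) {
        next++;
      }
      if (next == segment.size()) {
        return true;
      }
    }
    return false;
  }
}
```

Because the waypoints must be found in order, a trace which covers the same ground in the opposite direction does not match.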

All I did was play around with a web UI showing photographs from orbiting cameras, adjacent to a map of the world with humanity's activities inferred by published traces of how phones, watches and bike computers calculated their positions from a set of atomic clocks, uploaded over the internet to a queue in Apache Kafka, processed for storage in AWS S3, whose consistency and throttling is the bane of my life and rendered via Apache Kafka, as covered in Strava Engineering. That is impressive work. Some of their analysis code is probably running through lines of code which I authored, and I'm glad to have contributed to something which is so useful, and, for the heatmap, beautiful to look at.

So no, I wasn't the one doing the advanced engineering, but I respect those who did, and am pleased to see the work of people I know being used in the app.