Steve Loughran

Transitive Issues (2022-08-02)<p>I am not going to discuss anything sensitive which gets discussed in the hadoop security list, but I do want to post my reply to someone who gave us a list of those artifacts with known CVEs, either directly or in their own shaded packaging of dependent libraries (jackson, mainly), and then effectively demanded an immediate fix of them all.</p><p>My response is essentially <i>"how do we do that without breaking everything downstream to the extent that nobody will upgrade?"</i>. Which is not me trying to dismiss their complaint, rather <i>"if anyone has the answer to this problem I would really love to know".</i></p><p>I do regret the failure to get OSGi support into hadoop 0.1x; then we could have had that level of isolation. But OSGi does have its own issues, hence the lack of enthusiasm. Would it be worse than the state we have today, though?</p><p>The message: I am trying to do as much as I can via macos dictation, which invariably requires a review and fixup afterwards. If things are confused in places, it means I didn't review properly. As to who sent the email, that doesn't matter. It's good that transitive dependency issues are viewed as a significant concern, bad that there's no obvious solution here apart from "I say we dust off and nuke the site from orbit".</p><p>(Photo: winter mountaineering in the Brecon Beacons, 1996 (?). Wondering how best to ski down the North Face of Pen y Fan)</p>
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4YLpTUlaY1d6OqSoHwog_mpv5xUocTnoj7nM1KPruQPDVQiDgZfV2toY4fEW5Wy2Ow8VSLRy5qUGBec_O0tg37iGUTugpnvkN7VBl0f_r6vC2cANmJuy_IoWZhaRUdcMyHjmkH1fP1_mB6GGrz1G9RQT4bfUsaRfBU22v_8cVNsmW2PPwdfXzocsT/s4967/1996-188-11-steve-brecon-ski-16.jpg" style="display: block; padding: 1em 0px; text-align: center;"><img alt="" border="0" data-original-height="3220" data-original-width="4967" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4YLpTUlaY1d6OqSoHwog_mpv5xUocTnoj7nM1KPruQPDVQiDgZfV2toY4fEW5Wy2Ow8VSLRy5qUGBec_O0tg37iGUTugpnvkN7VBl0f_r6vC2cANmJuy_IoWZhaRUdcMyHjmkH1fP1_mB6GGrz1G9RQT4bfUsaRfBU22v_8cVNsmW2PPwdfXzocsT/s400/1996-188-11-steve-brecon-ski-16.jpg" width="400" /></a></div>
<p>Thank you for this list.</p><p>Except in the special case of "our own CVEs", all hadoop development is in public, including issue tracking, which can be searched under <a href="https://issues.apache.org/jira/">https://issues.apache.org/jira/</a></p><p>I recommend you search for upgrades of hadoop, hdfs, mapreduce and yarn, identify the dependencies you are worried about, follow the JIRA issues, and, ideally, help get them in by testing, if not actually contributing the code. If there are no relevant JIRAs, please create them.</p><p>I have just made a release candidate for hadoop 3.3.4, for which I have attached the announcement. Please look at the changelog to see what has changed.</p><p>We are not upgrading everything in that list. There is a fundamental reason for this: many of these upgrades are not compatible. While we can modify the hadoop code itself to support those changes, it means a release has become transitively incompatible. That is: even if we did everything we could to make sure our own code does not break anything, the release is still going to break things because of those dependencies. And as a result people aren't going to upgrade.</p><p>Take one example: <a href="https://issues.apache.org/jira/browse/HADOOP-13386">HADOOP-13386</a> <i>Upgrade Avro to 1.9.2. </i></p><p>This is marked as an incompatible update: "Java classes generated from previous versions of avro will need to be recompiled". If we ship that, all your applications are going to break. As will everyone else's.</p><p>jersey updates are another source of constant pain, as the update to v2 breaks all v1 apps, and the two artifacts don't coexist. We had to fix that by using a custom release of jersey which doesn't use jackson.</p><p><a href="https://issues.apache.org/jira/browse/HADOOP-15983">HADOOP-15983</a> <i>Use jersey-json that is built to use jackson2</i></p><p><b>So what do we do?</b> </p><p>Do we upgrade everything and issue an incompatible release? If we do that, we know that many applications will not upgrade, and we will end up having to maintain the old version anyway. I'm 100% confident that this is true because we still have to do releases of Hadoop 2 with fixes for our own CVEs. </p><p>Or do we try to safely upgrade everything we can, and work with the downstream projects to help them upgrade their versions of Apache Hadoop, so that at least the attack surface is reduced?</p><p>This is not hypothetical. Look at two pieces of work I have been involved in recently, or have at least been tracking.</p><p><a href="https://issues.apache.org/jira/browse/PARQUET-2158">PARQUET-2158</a>. <i>Upgrade Hadoop dependency to version 3.2.0. </i></p><p>That moves parquet's own dependency from hadoop 2.10 to 3.2.0, so it will actually compile and run against the 3.x releases. People will be able to run it against 3.3.4 too... and at least this way we have set the bare minimum to a branch which still gets security fixes.</p><p><a href="https://issues.apache.org/jira/browse/HIVE-24484">HIVE-24484</a>. <i>Upgrade Hadoop to 3.3.1 And Tez to 0.10.2</i></p><p>This is an example of a team doing a major update; again, it brings them more up to date with all their dependencies as well as our own CVEs. From the github pull request you can see how things break, both from our own code (generally unintentionally) and from changes in those transitive dependencies.
As a result of those breakages, hive and tez have held back from upgrading for a long time.</p><p>One of the patches in 3.3.4 is intended to help that team:</p><p><a href="https://issues.apache.org/jira/browse/HADOOP-18332">HADOOP-18332</a>. <i>Remove rs-api dependency by downgrading jackson to 2.12.7.</i></p><p>This is where we downgraded jackson from the 2.13.2.2 version of Hadoop 3.3.3 to version 2.12.7. This is still up to date with jackson CVEs, but by downgrading we can exclude its transitive dependency on the javax.ws.rs-api library, so Tez can upgrade, and thus Hive. Once Hive works against Hadoop 3.3.x, we can get Apache Iceberg onto that version as well. But if the release were incompatible in ways that they considered a blocker, that wouldn't happen.</p><p>It really is a losing battle. Given your obvious concerns in this area, I would love to have your suggestions as to how the entire Java software ecosystem –for that is what it is– can address the inherent conflict between the need to keep the entire transitive set of dependencies up to date for security reasons and the need to avoid breaking everything downstream.</p><p>A key challenge is the fact that these updates often break things two or more steps down the dependency chain -a detail you often do not discover until you ship. The only mitigation which has evolved is shading: keeping your own private copy of the binaries. Which, as you note, makes it impossible for downstream projects to upgrade those dependencies themselves.</p><p><b>What can you and your employers do to help? </b></p><p>All open source projects depend on the contributions of developers and users. Anything your company's engineering teams can do to help here will be very welcome. At the very least, know that you have three days to qualify that 3.3.4 release to make sure that it does not break your deployed systems. If it does work, you should update all production systems ASAP. If it turns out there is an incompatibility during this RC phase, we will hold the build and do our best to address it. If you discover a problem after thursday, then it will not be addressed until the next release, which you cannot expect to see until September, October or later. You can still help then by providing engineering resources to help validate that release. If you have any continuous integration tooling set up: check out and build the source tree, then try to compile and test your own products against the builds of hadoop and any other parts of the Apache open source big data stack on which you depend.</p><p>To conclude then, I'd like to welcome you to the eternal challenge of trying to keep those libraries up to date. Please join in. I would note that we are also looking for people with JavaScript skills, as the yarn UI needs work and that is completely beyond my level of expertise.</p><p>If you aren't able to do this and yet you still require all dependencies to be up to date, I'm going to suggest you build and test your own software stack with Hadoop 3.4.0 as part of it. You would of course need to start with up-to-date versions of Jersey, Jackson, google guava, Amazon AWS and the like before you even get that far. However, the experience you get in trying to make this all work will again be highly beneficial to everyone.</p>
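<p>As a footnote to the above: here is a minimal sketch of what that dependency juggling looks like from a downstream Maven build. The artifact coordinates are real, but the versions are purely illustrative and the exclusion mirrors the rs-api example only as a pattern, not a recommendation:</p>
<pre><!-- Hypothetical downstream pom fragment: pin a transitive dependency
     version and exclude an artifact the application replaces itself.
     Versions here are illustrative, not recommendations. -->
<dependencyManagement>
  <dependencies>
    <!-- Override the jackson version Hadoop would otherwise pull in. -->
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.12.7</version>
    </dependency>
  </dependencies>
</dependencyManagement>
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.3.4</version>
    <exclusions>
      <!-- Drop a transitive artifact which clashes with the app's own. -->
      <exclusion>
        <groupId>javax.ws.rs</groupId>
        <artifactId>javax.ws.rs-api</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
</pre>
<p>Every override and exclusion like this becomes something the downstream team owns and must revisit on each Hadoop upgrade; multiply that across the whole stack and you see the scale of the problem.</p>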
<p><br /></p><p>Thanks,</p><p><br /></p><p>Steve Loughran.</p><p><br /></p><p>-----</p><p><br /></p><p>[VOTE] Release Apache Hadoop 3.3.4</p><p><br /></p><p>I have put together a release candidate (RC1) for Hadoop 3.3.4</p><p><br /></p><p>The RC is available at:</p><p>https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/</p><p><br /></p><p>The git tag is release-3.3.4-RC1, commit a585a73c3e0</p><p><br /></p><p>The maven artifacts are staged at</p><p>https://repository.apache.org/content/repositories/orgapachehadoop-1358/</p><p><br /></p><p>You can find my public key at:</p><p>https://dist.apache.org/repos/dist/release/hadoop/common/KEYS</p><p><br /></p><p>Change log</p><p>https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/CHANGELOG.md</p><p><br /></p><p>Release notes</p><p>https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/RELEASENOTES.md</p><p><br /></p><p>There's a very small number of changes, primarily critical code/packaging issues and security fixes.</p><p><br /></p><p>See the release notes for details.</p><p><br /></p><p>Please try the release and vote. The vote will run for 5 days.</p><div><br /></div>

Achievement unlocked. Collarbones (2021-11-28)<p> <i>(dictated in four different systems with minor punctuation fixups, hence the confused case of all the words and a bit of a disjointed feel. some of the worst transcription errors have been fixed by one-handed typing, but a lot left in to show the tools' awfulness. I've also added some Opinions About how bad speech recognition is.)</i></p><p>As of Thursday afternoon.I now possess.4 collarbones. or one collarbone in four pieces if you look at it that way.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdyhx8xxJ-hEDx-769sNWC-ps2Qo6Gtuvv8Up3ABxrgV1FotpAl2kvBpC4fbXGcsx4SUJzk9Y84wyrXRsPHnpza-iCC58_hQc7wzmd5gW1TufPlKHc1WrheJ2VxpxM4Khj5BySKhGoLaE/s2048/IMG_9462.JPG" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1536" data-original-width="2048" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdyhx8xxJ-hEDx-769sNWC-ps2Qo6Gtuvv8Up3ABxrgV1FotpAl2kvBpC4fbXGcsx4SUJzk9Y84wyrXRsPHnpza-iCC58_hQc7wzmd5gW1TufPlKHc1WrheJ2VxpxM4Khj5BySKhGoLaE/s320/IMG_9462.JPG" width="320" /></a></div><br /><p><br /></p><p>This was not intentional.And for the curious desn't actually hurt, provided I don't actually move it at all. But.I am on the high quality drugs The NHS provided for me. Something with.Codeine in. </p><p>I had set off for a.End of Autumn mountain bike ride on a sunny but cool day.And had made it over the bridge.2.The Ashton Court park.I Have ridden., many, many times. </p><p><i>[Side note.I tried dictating this.ThroughGoogle document speech recognition.Full stop.Period. Period.I think I will give a talk next year on the sheer awfulness of the different speech recognition.Systems built into Mac, Windows and.Google.Office.And how?They let down.Anybody?Who cannot actually use a keyboard?The only one.That is vaguely usable.Is.Mac OS offline voice recognition.HyphenBut that still has many flaws.Including.Its use of proper nouns.And the fact that the product is really.Unmaintained. Everyones R&D budget.Is clearly being spent on online speech recognition.With a focus on phones.And short text messages.Full stop.She's where punctuation and correcting what you have typed.Ah.Not considered important .]</i></p><p>[update: google docs speech recognition has just stopped working. im using my windows laptop as a tablet with the on screen keyboard. I'm trying Windows dictation, or, as the product should be known, "Microsoft Something went wrong -try again in a little while"]</p><p>returning to why i cannot use a mechanical keyboard, strava shows the end of my journey.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcigC5aRuPRh0eyyofq4EbxVKhTJBDe_S_fHqHYuMS3U3ug7SHC__SfypXoiLKgjr0xzGxXK7O1XVnvrIxux0bdAtepLVnZRlBPvIkFj8UwLgzNhU0FwTyThIZrAJC_kGGuA751nZbT-8/s610/Screen+Shot+2021-11-28+at+13.10.55.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="350" data-original-width="610" height="184" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcigC5aRuPRh0eyyofq4EbxVKhTJBDe_S_fHqHYuMS3U3ug7SHC__SfypXoiLKgjr0xzGxXK7O1XVnvrIxux0bdAtepLVnZRlBPvIkFj8UwLgzNhU0FwTyThIZrAJC_kGGuA751nZbT-8/s320/Screen+Shot+2021-11-28+at+13.10.55.png" width="320" /></a></div><br /><p><br /></p><p> i swung off the road, through the gap in the railings and onto the trail, as i have done many times before except this time,I find myself launched into the air with the bicycle. </p><p>I have no idea what went wrong this wasn't a sideways front wheel slide out, more the kind of forward launch you'd do if you went over a 50cm+ dropoff without keeping your weight back Except that there was no dropoff. 
Maybe I just wasn't holding on to the bars tightly enough for the transition from tarmac to trail and the front wheel just twisted enough to trigger the Physics Event</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipm2ns6JbmXDkS7ymWlMfhXYRRX69Ons7SATQiSDV2l_tKgC-EMBCPtPUmK8dO16QMi402uIVOiag451eOOVh-DFvAZZ5D5JqyqOFnV3w1CpGR2SkarovERSnsdIJd5iZiaD-wBPB-TtY/s284/Screen+Shot+2021-11-28+at+13.11.30.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="249" data-original-width="284" height="249" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipm2ns6JbmXDkS7ymWlMfhXYRRX69Ons7SATQiSDV2l_tKgC-EMBCPtPUmK8dO16QMi402uIVOiag451eOOVh-DFvAZZ5D5JqyqOFnV3w1CpGR2SkarovERSnsdIJd5iZiaD-wBPB-TtY/s0/Screen+Shot+2021-11-28+at+13.11.30.png" width="284" /></a></div><br /><p><br /></p><p>strava says i was doing 20 kmh, so 72 kg of cyclist, 13 kg of steel hardtail MTB and a few kg of baggage makes for ~1400 Joules of kinetic energy. The specific heat capacity of the human body is about 3.5 J/(g·K). </p><p>It's a definite design failure of humans that we cannot disperse that excess energy by converting it into heat. If we could I'd have been 0.005 Kelvins warmer </p><p>I would have waited a few seconds to cool down before heading on my way. Sadly, we don't work like that, the energy has just moved bits of my interior around. </p><p>I sit up while some people nearby run over to make sure I'm OK. It was clearly dramatic. I feel a bit bashed but no pain. However the general rule for mountain bike related crashes is.: sit down and wait for the adrenaline wave to pass and then you can actually assess how injured you are. </p><p>This time while I wait for that wave to pass I do actually feel under my jacket to see what my chest is like and where I normally have a collar bone I can feel some things moving under my skin. This is not good. </p><p>I've never actually broken anything before in my life. I have a far number of dents bruises scars and other bits of damage collected over the years. I haven't been able to run since 2007 on account of tendon damage on my left leg. And the back of my right leg has a set of scars the exact same shape and radius as a chainring. But never breakage. At least here I'm only a couple of miles away from home. I had initially considered just walking back pushing the bike but given the state of those bones that's not gonna happen.</p><p> Phone up my wife who is at home waiting for a replacement dishwasher to be delivered. It's not arrived yet and the status update implies it is at least half an hour away. This gives her enough time to drive over collect me and bring me home. The crash's audience- a couple visiting Bristol- stay around with me until then and help load the bicycle. Then it's home and onto accident and emergency. </p><p>The drive home is fine except every time we go over a bump it hurts. Anyone who knows Bristol will appreciate it's going to hurt quite a lot on account of the roads are slowly dissolving into a state which archaeologists wouldn't consider up to standard of Neolithic hunting trails. I realise I made a mistake here. We have a first aid kit in the car and i should have put my arm in a sling and i should have put my arm in a sling straightaway. not for the drive but because you need to factor in the time you'll be sitting in the hospital.
.</p><p><i>(that section, repetitions included but not the maths, was dictated using Microsoft "something went wrong" . I worked out that even though Google docs and visual studio don't take dictation (why not?) Notepad does. So I can dictate into notepad for a while Then copy and paste It into Google Docs. Just like all the other server side NLP-as-a-service product you can only dictate for a short period of time before it makes a beeping sound and stops then you have to press a key to start it again. That's a really great idea isn't it? To add a product to help people with accessibility issues interact with your computer by having to press windows+ H *simultaneously* on the keyboard every 30 seconds to say "no I still don't have the ability to use a keyboard"? what product manager felt that was acceptable?</i></p><p><i>but, it suddenly switched to a "let's go back and overwrite everything you've just typed mode" and I could not get out of it. I do not want to bring up window services manager and start killing things just to see if that makes a difference or reboot the system. After all, if it is the servers that are sending the wrong instructions back to the OS it's not going to make any difference whatsoever.</i></p><p><i>I've switched to an iPad.</i></p><p>i<i>OS speech recognition is another "speech misunderstanding as a service" Product . When I do get round to giving that talk about the mediocrity of online speech recognition services, I will cite Apple products as an example of unconscious class bias in speech recognition based on datasets skewed to your existing customer base.</i></p><p><i>As long as I put on a measured middle-class southern English accent "smug NW3 postcode mode" it seems to understand what i say. Use my default accent, "Excited NW6" and it gets less reliable. I fear for anyone with a strong Glasgow or Liverpool accent trying to get Siri to do anything. </i></p><p>down to A&E. 10 minutes walk; less bumpy than a drive. For Americans, A&E is a bit like ER only less expensive. you don't even get asked for credit card and ID before they let you sit down. </p><p>You also get a broad view of a diverse city with the selection bias that everybody in there has a health issue they consider urgent. I'm sitting one seat away from a Somali woman who alternates between talking on the phone to weeping quietly. She says she doesn't need any help. Nearby a French-speaking father and son await attention; the son's foot does not point the correct way out of a leg. And while I wait the police wheel in someone wearing those white paper oversuits they always showing police dramas when the forensic team are trying to work out how someone died. Trivia: plastic bags they were on their feet have flat soles So they don't leave footprints. Judging by the way the patient's leg is held out they may not deliver much traction. I do wish however that more of the people in the room would wear face masks, and of those that do, it's time they should've learned how they work and that they need to cover the nose. I am glad I had my third booster shot a few weeks ago.</p><p>after an hour sitting in the chair with my arm slumped down by my side I am invited in for assessment by a nurse. Before looking at the shoulder she looks for any other injuries discusses whether I banged my head how does my neck feel can I turn it et cetera et cetera all good. What about your shoulder pain she asks on a scale of one to 10? 2 to 3 as long as I don't move it I reply. she touches the collarbone. 
As tears spontaneously come out of my eyes I say "that's a bit more. "She agrees I'll be needing an x-ray and sending me off to a different waiting room. </p><p>One thing I've learned from a couple of other visits to the hospital is that it is good to have a bag with things that are useful. Phone charger for example, something to read and epilepsy medicines. This time I've brought a short sleeved shirt which buttons up at the front. Will be needing that soon. the X-ray waiting area is nearly empty. Signs up around the room are addressed to victims of domestic violence and giving them phone numbers to call for help. That's not a good sign of how some adults end up in that part of the hospital. In the kids x-ray section it's all about why you should be more careful on trampolines… though by that time it's a bit late. </p><p>in the x-ray room they help me get my outer jacket off, then they cut off the inner cycling top completely to put it off easily. they did offer to see if we can get it off unscathed but I lack the sentimentality to want to hurt myself quite that much. Then I get to participate in the second major physics event of the day. This time I'm the target of a low-luminosity beam of photons in the 5-10 KeV range while the radiologist runs to a safe distance -as they should. </p><p>back in the main waiting room now wearing a gown. Someone comes in screaming about brickwork in their eyeball. They get priority, While the rest of us make a mental note About the value of safety glasses</p><p>I sit around another half hour before I'm pulled in for my results. <i>"you have broken your collarbone -but it looks like it will heal without any intervention which is good because there is not much intervention we can do</i>". I put my replacement shirt on and I'm given a sling and some quality prescription painkillers. I go home, have some food and something to drink – I was so thirsty but in case I was going in for surgery I hadn't drunk anything since the crash.</p><p>Up into the living room where we pack cushions around me until I'm in a comfortable position. This is pretty much where I'm staying right now; I've been sleeping here too. </p><p>apparently the first few days are the most painful, I am trying to move very carefully and not to use that arm at all. Provided I take the painkillers every few hours and don't use the arm it is mostly okay. I would really like to drink a beer but the prescription forbids it. oh well, sometimes you have to prioritise. </p><p>now what? Well, I don't have any follow-up visits with the hospital arranged. I stuck my x-ray photo up on the orange riders Facebook group and got feedback from all the other people that have done similar things. It splits between <i>" I did that and I was back on my bike within six weeks"</i> and <i>"look at the pins they've stuck in my bone after things didn't get any better after eight weeks. They go beep through an airport metal detector and it hurts in the cold"</i>. currently I'm hoping for the heal on their own outcome.</p><p>The fact that I can't type and that speech recognition across all the various platforms is so awful it's going to complicate my life for the next couple of weeks. I won't be coding, I can do some code reviews and I've been collaborating with a colleague on a big backport exercise which we can do together over zoom. 
</p><p><i>(update: iOS dictation suddenly went into this weird mode there where it kept re-typing and then going back and re-typing the same sentence again and again until I tapped the stop dictating button. One good feature of Amazon Alexa as you can look at the voice history on the application and mark up which ones were in fact completely wrong. I don't see any mechanism of doing that with any of the other tools – and without that there is no way for the system to actually train on what is the usability experience of individuals. Yes you may be able to rely on deep aggregate data of all your users, but without the feedback loop to say "this sentence is wrong", I don't see how you can actually improve the experience -or even assess how well your product is actually working.)</i></p><p><br /></p><p>To close then: <b>it's a pretty painful end to 2021 but I'm looking forward to 2022. </b></p><p><br /></p><p><i>(Meanwhile I have to try and get any of these speech recognition disasters to work. Apple Mac off-line dictationOn a laptop with sticky keys enabledIs my goal.This is the one I'm using right nowThe one which isn't putting spaces between sentencesAnd assumes any pause for more than a few seconds constitutesAn end of the sentence.Like I saidNo product manager of any these products should be proud of what they've achievedIn terms of accessibility.)</i></p>

offline (2021-02-27)<p> I've been off this blog for a while. Some RSI problems, amongst other reasons. I'll try and dictate things</p>

Isolation is not participation (2018-12-20)<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_pRmfp1NH1eTYhoTSpe8bDyGJ21JzcmuPu4tZdc5erj792LZNPWp8j9qjT_lESAgE3s2vjd7a_3DiVWaopdM4HXGBTnwJmNew4qtuFQIPa5Qk3x1-v-c21KNmvHYybO1mKG5yYX9ZHZw/s1600/P1000042.JPG" imageanchor="1"><img border="0" data-original-height="1200" data-original-width="1600" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_pRmfp1NH1eTYhoTSpe8bDyGJ21JzcmuPu4tZdc5erj792LZNPWp8j9qjT_lESAgE3s2vjd7a_3DiVWaopdM4HXGBTnwJmNew4qtuFQIPa5Qk3x1-v-c21KNmvHYybO1mKG5yYX9ZHZw/s400/P1000042.JPG" width="400" /></a><br />
<br />
<i>First</i><br />
<ol style="text-align: left;">
<li>I speak only for myself as an individual, not representing any current or previous employer, ASF ...etc.</li>
<li>I'm not going anywhere near business aspects; out of scope. </li>
<li>I am only looking at Apache Hadoop, which is of course the foundation of Amazon EMR. Which also means: if someone says "yes but projects X, Y & Z, ..." my response is "that's nice, but coming back to the project under discussion, ..."</li>
<li>lots of people I know & respect work for Amazon. I am very much looking at the actions of the company, not the individuals.</li>
<li>And I'm not making any suggestions about what people should do, only arguing that the current stance is harmful to everyone. </li>
<li>I saw last week that EMR now has a reimplementation of the S3A committers, without any credit whatsoever for something I consider profound. This means I'm probably a bit sensitive right now. I waited a couple of days before finishing this post.</li>
</ol>
<i>With that out the way:-</i><br />
<br />
As I type this a nearby terminal window <a href="https://github.com/steveloughran/hadoop/blob/filesystem/HADOOP-15229-openfile/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/select/ITestS3SelectMRJob.java">runs MiniMR jobs against a csv.gz file</a> listing AWS landsat photos, stored somewhere in a bucket.<br />
<br />
The tests run on a macbook, a distant descendant of BSD Unix, the Mach kernel and their incomplete open source sibling, Darwin. Much of the dev tooling I use is open source, downloaded via homebrew. The network connection is via a router running <a href="https://www.myopenrouter.com/">DD-WRT</a>.<br />
<br />
<br />
That Landsat file, s3a://landsat-pds/scene_list.gz, is arguably the single biggest contribution from the AWS infrastructure to that Hadoop testing. <br />
<br />
It's a few hundred MB of free-to-use data, so I've used it for IO seek/read performance tests, spark dataframe queries, and now, SQL statements direct to the storage infrastructure. Those tests are also where I get to explore the new features of the java language, <a href="https://github.com/steveloughran/hadoop/blob/filesystem/HADOOP-15229-openfile/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/LambdaTestUtils.java">LambdaTestUtils</a>, which is my lifting of what I consider to be <a href="http://steveloughran.blogspot.com/2016/09/scalatest-thoughts-and-ideas.html">the best bits of scalatest</a>. Now I'm adding async IO operations to the Hadoop FileSystem/FileContext classes, and in the tests I'm learning about the java 8+ completable future stuff, and how to get them to run IOException-raising code (TL;DR: <a href="http://steveloughran.blogspot.com/2018/10/javas-use-of-checked-exceptions.html">it hurts</a>)<br />
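<br />
For anyone who hasn't used the API, a minimal sketch of the kind of read those tests exercise; the class name and buffer size are mine, and a real setup needs the hadoop-aws module on the classpath plus anonymous-access credentials configured for the public bucket:<br />
<pre>import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LandsatRead {
  public static void main(String[] args) throws Exception {
    // the public landsat scene index used throughout the hadoop-aws tests
    Path scenes = new Path("s3a://landsat-pds/scene_list.gz");
    FileSystem fs = scenes.getFileSystem(new Configuration());
    try (FSDataInputStream in = fs.open(scenes)) {
      byte[] buffer = new byte[1024];
      int bytesRead = in.read(buffer);  // like all FS IO, may throw IOException
      System.out.println("read " + bytesRead + " bytes of gzipped CSV");
    }
  }
}
</pre>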
<br />
While I wait for my tests to finish, I see there's a lot of online discussion about cloud providers and open source projects, especially post AWS re:Invent (re:Package?), so I thought I'd join in. <br />
<br />
<br />
Of all the bits of recent writing on the topic, one I really like is Roman's, which focuses a lot on <a href="https://medium.com/@rhatr/is-it-time-for-cloud-native-open-source-db0ad6e695e5">community over code</a>.<br />
<br />
That is a key thing: open source development is a community. And members of that community can participate by<br />
<ol style="text-align: left;">
<li>writing code</li>
<li>reviewing code </li>
<li>writing docs</li>
<li>reviewing docs</li>
<li>testing releases, nightly builds</li>
<li>helping other people with their problems</li>
<li>helping other projects who are having problems with your project's code.</li>
<li>helping other projects take ideas from your code and make use of it </li>
<li>filing bug reports</li>
<li>reviewing, commenting, on, maybe even fixing bug reports. </li>
<li>turning up a conferences, talking about what you are doing, sharing</li>
<li>listening. Not just to people who pay you money, but people who want to use the stuff you've written.</li>
<li>helping build that community by encouraging the participation of others, nurturing their contributions along, trying to get them to embrace your code and testing philosophy, etc.</li>
</ol>
There are more, but those are some of the key ones.<br />
<br />
A key, recurrent theme is that community: you can contribute in many ways, but you do have to be proactive to build it. And the best ASF projects are the ones with a broad set of contributors.<br />
<br />
Take for example, the grand Java 9, 10, 11 project: [<a href="https://issues.apache.org/jira/browse/HADOOP-11123">HADOOP-11123</a>, <a href="https://issues.apache.org/jira/browse/HADOOP-11423">HADOOP-11423</a>, <a href="https://issues.apache.org/jira/browse/HADOOP-15338">HADOOP-15338</a> ]. That's an epic of suffering, primarily taken on by Akira Ajisaka and Takanobu Asanuma at Yahoo! Japan, and a few other brave people. This isn't some grand "shout about this at keynotes" kind of feature, but it's a critical contribution by people who rely on Hadoop, have a pressing need to run on Java 9+, and are prepared to put in the effort. I watch their JIRAs with awe.<br />
<br />
That's a community effort, driven by users with needs.<br />
<br />
Another interesting bit of work: <a href="https://issues.apache.org/jira/browse/HDFS-12090">Multipart Upload from HDFS to other block stores</a>. I've been <a href="https://issues.apache.org/jira/browse/HDFS-13713">the specification police there;</a> my reaction to "<i>adding fundamental APIs without strict specification and tests"</i> was predictable, so I don't know why they tried to get it past me. Who did that work? Ewan Higgs at Western Digital did a lot -they can see real benefit for their enterprise object store. So did Virajith Jalaparti at Microsoft; people who want HDFS to be able to use their storage systems for the block store. And there's another side-effect: that multipart upload API essentially provides a standard API for multipart-upload-based committers. For the S3A committers we added our own private API to the S3A FS, "WriteOperationHelper"; this will give one to every FS which supports it. And you could build a version of DistCp which writes blocks to the store in parallel from across the filesystem... a very high performance blockstore-to-block-or-object-store copy mechanism.<br />
<br />
This work is all being done by people who sell storage in one form or another, who see value in the feature, and who are putting in the effort to develop it in the open, encourage participation from others, and deliver something independent of the underlying storage. <br />
<br />
This bit of work highlights something else: that Hadoop FS API is broadly used way beyond HDFS, and we need to evolve it to deal with things HDFS offers but keeps hidden, but also for object stores, whose characteristics involve:<br />
<ul style="text-align: left;">
<li>very expensive seek(), especially given ORC and Parquet know their read plan way beyond the next read. Fix: <a href="https://issues.apache.org/jira/browse/HADOOP-11867">HADOOP-11867: Vectorized Read Operations</a>.</li>
<li>very expensive mimicking of hierarchical directory trees: treewalking is not ideal.</li>
<li>slow to open a file, as even the existence check can take <a href="http://steveloughran.blogspot.com/2016/12/how-long-does-filesystemexists-take.html">hundreds of milliseconds</a>.</li>
<li>interesting new failure modes: throttling/503 responses if you put too much load on a shard of the store, for example, which <a href="https://issues.apache.org/jira/browse/HADOOP-15209">can surface anywhere</a>.</li>
<li>variable rename performance: O(data) for mimicked S3 rename, O(files) for GCS, O(1) with some occasional pauses on Azure.</li>
</ul>
There are big challenges here and it goes all the way through the system. There's no point adding a feature in a store if the APIs used to access it don't pick it up; there's no point changing an API if the applications people use don't adopt it.<br />
<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
Which is why input from the people who spent time building object stores and hooking their application is so important. That includes:<br />
<ul style="text-align: left;">
<li>Code</li>
<li>Bug reports</li>
<li>Insight from their experiences</li>
<li>Information about the problems they've had</li>
<li>Problems their users are having</li>
</ul>
And that's also why the decision of the EMR team to isolate themselves from the OSS development holds us back.<br />
<br />
We end up duplicating effort, like <a href="https://hadoop.apache.org/docs/r2.9.2/hadoop-aws/tools/hadoop-aws/s3guard.html">S3Guard</a>, which is the ASF equivalent of <a href="https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-consistent-view.html">EMR consistent view</a>. The fact that EMR shipped with that feature long before us could be viewed as the benefit of having a proprietary S3 connector. But S3Guard, like the EMR team's work, is modelled on <a href="https://medium.com/netflix-techblog/s3mper-consistency-in-the-cloud-b6a1076aa4f8">Netflix S3mper</a>. That's code which one of the EMR team's largest customers wrote, code which Netflix had to retrofit onto the EMR closed-source filesystem <a href="https://github.com/Netflix/s3mper/blob/master/src/main/java/com/netflix/bdp/s3mper/listing/ConsistentListingAspect.java">using AspectJ</a>.<br />
<br />
And that time-to-market edge? Well, it's not so obvious any more: <br />
<ol style="text-align: left;">
<li>The <a href="https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-s3-optimized-committer.html">EMR S3-optimised committer </a>shipped in November 2018.</li>
<li>Its open source predecessor, <a href="https://hadoop.apache.org/docs/r3.1.1/hadoop-aws/tools/hadoop-aws/committers.html">the S3A committers</a>, shipped in Hadoop 3.1, March 2018. That's 7-8 months ahead of the EMR implementation.</li>
<li>And it shipped in HDP-3.0 <a href="https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/bk_cloud-data-access/content/ch03s08s01.html">in June 2018</a>. That's 5 months ahead. </li>
</ol>
I'm really happy with that committer, first bit of CS-hard code I've done for a long time, got me into the depths of how committers really work, got an unpublished paper "<a href="https://github.com/steveloughran/zero-rename-committer/releases/download/tag_draft_003/a_zero_rename_committer.pdf">A zero Rename Committer</a>" from it. And, in writing the <a href="https://github.com/steveloughran/formality/releases/download/tag_blobstore_0.3/objectstore.pdf">TLA+ spec of the object store </a>I used on the way to convincing myself things worked, I was corrected by Lamport himself.<br />
<br />
Much of that committer code was written by myself, but it depended utterly on some insights from <a href="https://issues.apache.org/jira/browse/HADOOP-13786?focusedCommentId=15755057&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15755057">Thomas Demoor of WDC</a>. It also contains a large donation of code from Netflix, <a href="https://github.com/rdblue/s3committer">their S3 committer</a>. They'd bypassed emrfs completely and were using the S3A client direct: we took this, hooked it up to what we'd already started to write, incorporated their mockito tests —and now their code is the specific committer I recommend. Because Netflix were using it and it worked.<br />
<br />
<i>A heavy user of AWS S3 wanting to fix their problem, sharing the code, having it pulled into the core code so that anyone using the OSS releases gets the benefit of their code *and their testing*.</i></div>
<div dir="ltr" style="text-align: left;" trbidi="on">
</div>
<div dir="ltr" style="text-align: left;" trbidi="on">
</div>
<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
We were able to pick that code up because Netflix had gone around emrfs and were writing things themselves. That is: they had to bypass the EMR team. And once they'd done that, we could take it, integrate it and ship it eight months before the EMR team did. With proofs of correctness(-ish).<br />
<br />
Which is why I don't believe that isolationism is good for whoever opts out of the community. Nor, by the evidence, is it good for their customers.<br />
<br />
I don't even think it helps the EMR team's colleagues <a href="https://lists.apache.org/thread.html/87eb8e16abad09232da5fbc6999c19c4fba0f16d641c565e62096564@%3Cuser.flink.apache.org%3E">with their own support calls</a>. Because really, if you aren't active in the project, those colleagues end up <a href="https://issues.apache.org/jira/browse/FLINK-11187">depending on the goodwill of others</a>.</div>
<br />
<br />
(photo: building in central Havana. There'll be a starbucks there eventually)</div>
Java's use of Checked Exceptions cripples lambda-expressions (2018-10-05)<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
<br />
I like lambda-expressions. They have an elegance to them which, when they go into my code alongside comments using the term "iff", probably marks me out as a Computer Scientist; the way people who studied Latin drop random phrases into sentences to communicate more precisely with others who did the same. Here, rather than use phrases like "sui generis", I can drop in obscure references to <a href="https://en.wikipedia.org/wiki/Lambda_calculus">Church's work</a>, allude to "The Halting Problem" and say "tuple" whenever "pair" wouldn't be elitist enough. <br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/30058789417/in/dateposted/" title="Jamaica street, September 2018"><img alt="Jamaica street, September 2018" height="254" src="https://farm2.staticflickr.com/1929/30058789417_fda7e6cb1b.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Really though, lambda-expressions are nice because they are a way to pass around snippets of code to use elsewhere<br />
<br />
I've mostly used this in tests, with <code>LambdaTestUtils.intercept()</code> being the code we've built up to use them, something clearly based on ScalaTest's work of the same name.<br />
<br />
<pre>protected void verifyRestrictedPermissions(final S3AFileSystem delegatedFS)
    throws Exception {
  intercept(AccessDeniedException.class,
      () -> readLandsat(delegatedFS));
}
</pre>
<br />
I'm also working on wiring up the <code>UserGroupInformation.doAs()</code> call to l-expressions, so we don't have to faff around creating over-complex <code>PrivilegedAction</code> subclasses, instead going <code>bobUser.doAs(() -> fs.getUsername())</code>. I've not done that yet, but have the stuff in my tests to explore it: <code>doAs(bobUser, () -> fs.getUsername())</code>.<br />
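<br />
As a sketch of why the static form is attractive, assuming only the existing UGI API (the wrapper class and its name are mine): <code>UserGroupInformation.doAs()</code> is overloaded for both <code>PrivilegedAction</code> and <code>PrivilegedExceptionAction</code>, so a bare lambda is typically ambiguous and needs a cast, while a helper whose signature names one interface removes that ambiguity:<br />
<pre>import java.io.IOException;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public final class UGILambda {
  private UGILambda() {}

  /**
   * Run an action as the given user. Because this signature names
   * PrivilegedExceptionAction explicitly, callers can pass a bare lambda
   * without the cast which UGI.doAs() otherwise needs to pick between
   * its PrivilegedAction and PrivilegedExceptionAction overloads.
   */
  public static <T> T doAs(UserGroupInformation user,
      PrivilegedExceptionAction<T> action)
      throws IOException, InterruptedException {
    return user.doAs(action);
  }
}
</pre>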
<br />
Java-8 has embraced this, with its streams API, <code>Optional</code> class, etc. I should be able to do the same elegant code in Java 8 that you can do in Scala, such as on an <code>Optional<UserGroupInformation></code> instance —no more need to worry about null pointers!<br />
<br />
<pre>Optional<Credentials> maybeCreds = maybeBobUser.map(b -> b.getCredentials());
</pre>
<br />
And I can do the same on those credentials<br />
<br />
<pre>List<TokenIdentifier> ids = maybeCreds
    .map(creds -> creds.getAllTokens().stream()
        .map(Token::decodeTokenIdentifier)
        .collect(Collectors.toList()))
    .orElseGet(LinkedList::new);
</pre>
<br />
Except, well, I can't. Because of checked exceptions. That <code>Token::decodeTokenIdentifier</code> method can raise <code>IOException</code> instances whenever there's a problem decoding the byte array which contains the token identifier (it can also return null for other issues; see HADOOP-15808).<br />
<br />
All Hadoop API calls which do some kind of network or IO operation declare that they throw an <code>IOException</code> when things fail. It's consistent, and it works <i>fairly</i> well. Exceptions from underlying libraries (AWS SDK, Azure SDK) we sometimes catch & map, do other error translation on, then feed into retry logic, and things even out. When you call <code>getFileStatus()</code> against s3a: or abfs:// you can be confident that if the file isn't there you'll get a <code>FileNotFoundException</code>; if there was some connectivity issue our code will have retried, provided it wasn't something unrecoverable like a DNS/routing problem, where you'll get a <code>NoRouteToHostException</code> in your stack traces.<br />
<br />
Checked exceptions are everywhere in the Hadoop code.<br />
<br />
And the Java Streams API can't work with that. The operations on a stream don't declare that they raise exceptions, so none of the lambda-expressions you pass to them may either. I could jump through hoops and catch & convert them into some <code>RuntimeException</code> —but then what? All the code calling mine expects failures to come as <code>IOExceptions</code>, expects those <code>FileNotFoundExceptions</code>, etc. We cannot make serious use of the new APIs in our codebase.<br />
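<br />
For the record, here is roughly what that hoop-jumping looks like: a minimal sketch assuming nothing beyond the JDK, where <code>IOFunction</code> and both helper names are mine, not anything in the Hadoop or java.util libraries. Wrap the checked exception for the stream, then unwrap it at the boundary so callers still get the <code>IOException</code> they expect:<br />
<pre>import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Function;
import java.util.function.Supplier;

public final class CheckedStreams {
  private CheckedStreams() {}

  /** A function which is allowed to throw an IOException. */
  @FunctionalInterface
  public interface IOFunction<S, T> {
    T apply(S source) throws IOException;
  }

  /** Wrap: convert any IOException into an UncheckedIOException. */
  public static <S, T> Function<S, T> uncheck(IOFunction<S, T> fn) {
    return s -> {
      try {
        return fn.apply(s);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    };
  }

  /** Unwrap: run a pipeline, rethrowing the original IOException. */
  public static <T> T unwrapping(Supplier<T> pipeline) throws IOException {
    try {
      return pipeline.get();
    } catch (UncheckedIOException e) {
      throw e.getCause();
    }
  }
}
</pre>
It compiles and it keeps the contract, but every stream stage now pays a wrap/unwrap tax, and every project ends up with its own private copy of these helpers, which is exactly the maintenance pain described below.<br />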
<br />
Now, the Oracle team could just have declared that the new <code>map()</code> method raised <code>Exception</code> or similar, but then it'd have been unusable in those methods which don't declare that they throw exceptions, or those which, say, throw IOExceptions only. <br />
<br />
There's no obvious solution to this with those standard Java classes, leaving me the options of (a) not using them or (b) writing my own, which is something I've <a href="https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java#L1321">been doing in places</a>. I shouldn't have to do that; all it does is create maintenance pain, and it doesn't glue together with those standard libraries.<br />
<br />
I don't have a choice. And neither does anyone else using Java. Scala doesn't have this problem as exceptions aren't checked. Groovy doesn't have this problem as exceptions aren't checked. C# doesn't have this problem as exceptions aren't checked. Java, however, is now trapped by some design decisions made twenty+ years ago which seemed a good idea at the time.<br />
<br />
Is there anything Oracle can do now? I don't know. You could change the compiler to say "all exceptions are unchecked" and see what happens. I suspect a lot of code will break. And because it'll be on the failure paths where problems surface, it'd be hard to get that test coverage to be sure that failures are handled properly. Even so, I can imagine that happening, otherwise, even as the language tries to "stay modern", it's crippled.</div>
Computer Architecture and Software Security (2018-04-02)<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/26268449517/" title="Gobi's End"><img alt="Gobi's End" height="240" src="https://farm1.staticflickr.com/811/26268449517_42139f45ee_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
There's a new paper covering another speculative execution-based attack on system secrets, <a href="http://www.cs.ucr.edu/~nael/pubs/asplos18.pdf">BranchScope</a>.<br />
<br />
This one relies on the fact that for branch prediction to be effective, two bits are generally allocated to it: strongly & weakly taken, and strongly & weakly not taken. The prediction state of a branch is the value in BranchHistoryTable[hash(address)], and is used to choose the speculation; if the prediction was wrong, the state moves from strongly to weakly, and from weakly to the opposite prediction. Similarly, if a weak prediction turns out right, it moves to strong.<br />
<br />
Why so complex? Because we loop all the time<br />
<pre>for (int i = 0; i < 1000; i++) {
  doSomething(i);
}
</pre>
<br />
Which probably gets translated into some assembly code (random CPU language I just made up)<br />
<br />
<pre> MOV r1, 0
L1: CMP r1, 999
JGT end
JSR DoSomething
ADD r1, 1
JMP L1
... continue
</pre>
<br />
For 1000 iterations of that loop the conditional branch goes the same way; then once, at the end, it goes the other way and the loop exits. The first time it's encountered, the CPU won't know what to do; it will just guess one of the two outcomes and have a 50% chance of being wrong (see below). After that first iteration though it'll guess right, until the final test exits the loop. If that loop is itself called repeatedly, the fact that the final iteration was mispredicted shouldn't lose the fact that the rest of the loop was predicted correctly. Hence, two bits.<br />
<br />
As Hennessy and Patterson write in Computer Architecture: A Quantitative Approach (v4, p89), <i>"the importance of branch prediction has increased".</i> With deeper pipelines and the mismatch of CPU speed and memory, guessing right matters.<br />
<br />
There isn't enough space in the Branch History Table to store 2 bits of history for every single branch in a system, so instead there'll be some smaller table and some function to map the full address to an offset in that table. According to [<a href="https://staff.fnwi.uva.nl/s.polstra/aca2016/p76-pan.pdf">Pan92</a>], 4096 to 8192 entries is not that far off "an infinite" table. All that's left is the transform from program counter to BHT entry, which for 32-bit-aligned opcodes can be something as simple as <tt>(PC >> 2) & 8191</tt>.<br />
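<br />
As a concrete illustration, here is a minimal sketch of such a predictor, in Java for readability; the table size and index transform are the illustrative values from above, not those of any real CPU:<br />
<pre>// A 2-bit saturating-counter branch predictor with a finite history table.
// Counters 0-1 predict "not taken", 2-3 predict "taken"; one misprediction
// only nudges a counter a single step, which is why the lone mispredicted
// exit branch of a hot loop doesn't flip its prediction.
class TwoBitPredictor {
  private final int[] bht = new int[8192];        // Branch History Table

  private int index(long pc) {
    return (int) ((pc >> 2) & (bht.length - 1));  // 32-bit aligned opcodes
  }

  boolean predictTaken(long pc) {
    return bht[index(pc)] >= 2;
  }

  void update(long pc, boolean wasTaken) {
    int i = index(pc);
    bht[i] = wasTaken
        ? Math.min(3, bht[i] + 1)                 // strengthen "taken"
        : Math.max(0, bht[i] - 1);                // strengthen "not taken"
  }
}
</pre>
Note that nothing in <tt>index()</tt> distinguishes one branch from another beyond its address.<br />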
<br />
But the table is not infinite, there will be clashes: if something else is using the same entry in the BHT, then your branch may be predicted according to its history.<br />
<br />
The new attack then simply works out the taken/not-taken state of the target branch by seeing how your own code, whose addresses are designed to conflict in the BHT, gets predicted. That's all. Given that ability to observe branch direction, an attacker can then reach conclusions about the state of the system.<br />
<br />
Along with caching, branch prediction is the key way in which modern CPUs speed things up. And it does. But it's the clash between your entries in the cache and BHT and that of the target routine which is leaking information: how long it takes to read things, whether a branch is predicted or not. The very act of speeding up code is what leaks secrets.<br />
<br />
"Modern" CPU Microarchitecture is in trouble here. We've put decades of work into caching, speculation, branch prediction, and now they all turn out to expose information. We built for speed, at what turns out to be the cost of secrecy. And in cloud environments where you cannot stop malicious code running on the same CPU, that means your secrets are not safe.<br />
<br />
What can we do?<br />
<br />
Maybe another microcode patch is possible: flush the BHT when switching from user mode to OS mode. But that would cripple performance in any loop which invokes system code. Or you somehow isolate BHT entries for different virtual memory spaces; probably the best long-term fix, but I'll leave it to others to work out how to implement.<br />
<br />
What's worrying is the fact that new exploits are appearing so soon after Meltdown and Spectre. Security experts are now looking at all of the speculative execution bits of modern CPUs and thinking "that's interesting..."; more exploits are inevitable. And again, systems, especially cloud infrastructures, will be left struggling to catch up.<br />
<br />
Cloud infrastructures are probably going to have to pin every VM to a dedicated CPU, with the hypervisor on its own core. That will limit secret exfiltration to the VM OS and anything else running on the core (the paper looks at the Intel SGX "secure" zone and shows how it can be targeted). It'll be the smaller VMs at risk here, and potentially containerized stuff: you'd want all containers on a single core to be "yours".<br />
<br />
What about single-core systems running a mix of trusted and untrusted code (your phone, your web browser)? That's going to be hard. You can't dedicate one x86 core per browser tab.<br />
<br />
Longer term: we're going to have to go through every bit of modern CPU architecture from a security perspective and ask "is this safe?". And no doubt conclude that any speedup mechanism which relies on the history of previous work is insecure, if that history includes the actions taken (or speculatively taken) by sensitive applications.<br />
<br />
Which is bad news for the majority of today's high end CPUs, especially those ones trying to keep the x86 instruction set alive. Those are the parts which have had so much effort invested into getting fractional improvements in caching, branch prediction, speculation and pipeline efficiency, and so have gotten incredibly complex. That's where the big vulnerabilities live.<br />
<br />
This may push us back towards "underperformant but highly parallel" massively multicore systems. Little/no speculation, isolating user space code into their own processes.<br />
<br />
The most recent example of this is/was the <a href="https://pdfs.semanticscholar.org/17ff/9ea3c1bae17e5b29be5a358a22d27ba64c9e.pdf">Sun Niagara CPU line</a>, which started off with a pool of early-90s-era SPARC cores without fancy branch prediction... instead they had four sets of state to cover the entire execution state of four different threads, scheduling work between them. Memory access? Stall that thread, schedule another. Branch? Don't predict, just wait and see, and add other threads' opcodes to the pipeline. <br />
<br />
There are still going to be security issues there (the cache is shared across the many cores, and the actions of one thread can be implicitly observed by others through their execution times). And it seemingly does speculate memory loads if there was no other work to schedule.<br />
<br />
What's equally interesting is that the system <a href="http://www.oracle.com/technetwork/systems/opensparc/06-sparc-power-effecient-1530403.pdf">is so power efficient</a>. Speculative execution and branch prediction require lots of gates, renamed registers, branch history tables and the like —every missed prediction is energy wasted. Compare that to an Itanium part, where you almost need to phone up your electricity supplier for permission to power one up.<br />
<br />
The <a href="https://pdfs.semanticscholar.org/c155/5f718910849bab9d3dd73842b408dc79420c.pdf">Niagara 2</a> part pushed it ahead further to a level that is impressive to read. At the same time, you can see a great engineering team struggling with a fab process behind what Intel could do, Sun trying to fight the x86 line, and, well, losing.<br />
<br />
Where are the parts now? Oracle's <a href="https://community.oracle.com/servlet/JiveServlet/previewBody/1017902-102-1-163307/T8M8_Architecture_WP_20170914.pdf">M8 CPU PDF</a> touts its out-of-order execution —speculative execution— and data/instruction prefetch. I fear it's now got the same weaknesses as everything else. Apparently the Java 8 streams API gets bonus speedup, which reminds me to post something criticising Java checked exceptions for making that API unusable for the <tt>throws IOException</tt> Hadoop codebase. As for the virtualization support, again, you'd need to think about pinning to a CPU. There's also that $L1-$L3 cache hit/miss problem: something speculating in one CPU could evict cached data observable to others, unless speculative memory fetches weren't a feature of the part.<br />
<br />
They look nice-but-pricey servers; if you are paying the Oracle RDBMS tax, the all-in-one price might mitigate that. Overall though, a focus on many fast-but-dim parts, over the smaller number of "throw Si at maximum single thread performance" cores of recent x86 designs, <i>may</i> provide opportunities for future designs to be more resistant to attacks related to speculative execution. I also think I'd like to see their performance numbers running Apache Spark 2.3 with one executor per thread and lots of RAM.<br />
<br />
<b>Update April 3 2018</b>: I see that within hours of this posting, rumours started that Apple is looking at ARM parts for macbooks in 2020+. Not a coincidence! Actually it is, but because the ARM parts are simpler, they may be less exposed to specex-based attacks, even though Meltdown did affect those implementations which did speculative memory fetches. I think the Niagara architecture has more potential, but it probably works best in massively-multithreaded server-side systems, not laptops, where latency is the performance metric, not throughput.<br />
<br />
[Photo: my 2008 Fizik Gobi saddle snapped one of its Titanium rails last week. Made it home in the rain, but a sign that after a decade, parts just wear out.]

Advanced Deanonymization through Strava (2018-01-29)<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/9675384786/in/album-72157667789976296/" title="Slow Except for Strava"><img alt="Slow Except for Strava" height="375" src="https://farm4.staticflickr.com/3696/9675384786_12ddfb564a.jpg" width="500" /></a><br />
<script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Strava is getting some bad press, because <a href="https://www.theguardian.com/world/2018/jan/28/fitness-tracking-app-gives-away-location-of-secret-us-army-bases">its heatmap can be used</a> to infer the existence and floorplan of various US/UK military and govt sites.<br />
<br />
I covered this briefly in my Berlin Buzzwords 2016 <a href="https://berlinbuzzwords.de/16/session/household-infosec-post-sony-era">Household INFOSEC talk</a>, though not in that much detail about what's leaked, or what a Garmin GPS tracker is vulnerable to (not classic XML entity/XInclude attacks, but a malicious site could serve up a subverted GPS map that told me the local motorway was safe to cycle on).<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/27304303722/in/album-72157667789976296/" title="Untitled"><img alt="Untitled" height="375" src="https://farm8.staticflickr.com/7046/27304303722_73e6b78332.jpg" width="500" /></a><br />
<script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Here are some things Strava may reveal<br />
<br />
<ul>
<li>Whether you run, swim, ski or cycle.</li>
<li>If you tell it, <a href="https://www.strava.com/settings/gear">what bicycles you have</a>.</li>
<li>Who you go out on a run or ride with.</li>
<li>When you are away from your house.</li>
<li>Where you commute to, and when.</li>
<li>Your fitness, and whether it is getting better or worse.</li>
<li>When you travel, what TZ, etc.</li>
</ul>
<br />
How to lock down your account?<br />
<br />
I only try to defend against drive-by attacks, not nation states or indeed, anyone competent who knows who I am. For Strava then, my goal is: do not share information about where my bicycles are kept, nor those of my friends. I also prefer not to share too much about the bikes themselves. This all comes secondary to making sure that nobody follows me back from a ride over the Clifton Suspension Bridge (standard attack: wait at the suspension bridge, cycle after them. Standard defence: go through the Clifton backroads, see if you are being followed). And I try to make sure all our bikes are locked up, doors locked etc. The last time one got stolen was due to a process failure there (unlocked door) and so the defences fell to some random drug addict rather than anyone using Strava. There's a moral there, but it's still good to lock down your data against tomorrow's Strava attacks, not just today's. My goal: keep my data secure enough to be safe from myself. <br />
<ol>
<li>I don't use my real name. You can use a single letter as a surname, an "!", or an emoji.</li>
<li>And I've made sure that none of the people I ride with regularly do so either</li>
<li>I have a private area around my house, and those of friends.</li>
<li>All my bikes have boring names "The Commuter", not something declaring value.</li>
<li>I have managed fairly successfully to stay off the KoM charts, apart from <a href="https://www.strava.com/segments/2560468">this climb</a> which I regret doing on so many levels.</li>
</ol>
For a long time I didn't actually turn the bike computer on until I'd got to the end of the road. I've got complacent there. Even though Strava strips the traces from the private zone when publishing, it does appear to declare the ride distance <i>as the full distance</i>. Given enough rides of mine, you can infer the radius of that privacy zone (fix? Have two overlapping circles?), and the distance on road from the cutoff points to where my house is (overlapping circles won't fix that). You'd need to strip out the start/stop points before uploading to Strava (hard), or go back to only switching on recording once you were a randomish distance from your set-off point.<br />
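<br />
As a back-of-envelope sketch of how that inference might work (all the names here are mine; nothing in it is a Strava API):<br />
<pre>
// Sketch: bounding a home location from published rides. Strava
// declares the full ride distance but strips the trackpoints inside
// the privacy zone, so declared minus visible distance is the on-road
// distance from the visible cutoff point back to the start. The
// minimum over many rides bounds the privacy-zone radius.
import java.util.List;

class PrivacyZoneProbe {

  // great-circle distance between two (lat, lon) points, in km
  static double haversineKm(double[] a, double[] b) {
    double dLat = Math.toRadians(b[0] - a[0]);
    double dLon = Math.toRadians(b[1] - a[1]);
    double h = Math.pow(Math.sin(dLat / 2), 2)
        + Math.cos(Math.toRadians(a[0])) * Math.cos(Math.toRadians(b[0]))
        * Math.pow(Math.sin(dLon / 2), 2);
    return 2 * 6371 * Math.asin(Math.sqrt(h));
  }

  // declared distance minus visible trace length = hidden distance
  static double hiddenKm(double declaredKm, List<double[]> visibleTrace) {
    double visibleKm = 0;
    for (int i = 1; i < visibleTrace.size(); i++) {
      visibleKm += haversineKm(visibleTrace.get(i - 1), visibleTrace.get(i));
    }
    return declaredKm - visibleKm;
  }
}
</pre>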
<br />
I haven't opted out of the Strava Heatmap, as I don't go anywhere that people care about. That said, there are always concerns in our local MTB groups that Strava leaks the non-official trails to those people who view stopping MTB riders as their duty. A source of controversy.<br />
<br />
Now, how would I go for someone else's strava secrets?<br />
<br />
You can assume that anyone who scores high on a mountain bike trail is going to have an MTB worth stealing; the same goes for long road climbs.<br />
<ol>
<li>Ride IDs appear sequential, so you could harvest a day's worth and search.<br />
</li>
<li>Join the same cycling club as my targets, see if they publish rides. Fixes: don't join open clubs, ever, and verify members of closed clubs.<br />
</li>
<li>Strava KoM chart leakage. Even if you make your rides private, if you get into the top 10 riders for that day or whatever, you become visible.<br />
</li>
</ol>
The fact that you can infer nation-state secrets is an interesting escalation. Currently it's the heatmap which is getting the bad press, which is part of the dataset that Strava offer commercially to councils. FWIW, the selection bias on Strava data (male roadies or mountain bikers) means that it's not that good. If someone bought our local data, they'd infer that muddy wood trails with trees and rocks are what the city needs. Which is true, but it doesn't address the lack of any safe way to cross the city.<br />
<br />
What is interesting about the heat map, and not picked up on yet, is that you can potentially deanonymize people from it.<br />
<br />
First, find somewhere sensitive, like say, The UK <a href="https://labs.strava.com/heatmap/#16.79/-4.81840/56.06626/hot/all">Faslane Nuclear Submarine Base</a>. Observe the hot areas, like how people run in rectangles in the middle.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/39248135784/in/dateposted/" title="Faslane Heat Map"><img alt="Faslane Heat Map" height="423" src="https://farm5.staticflickr.com/4603/39248135784_66ceffbca8.jpg" width="500" /></a><br />
Now, go to <a href="http://www.mapmyride.com/">MapMyRide</a> and log in. Then head over to <a href="http://www.mapmyride.com/routes/create/">create a new route using the satellite data</a><br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/26085816258/in/dateposted/" title="Created Route from the hotspot map"><img alt="Created Route from the hotspot map" height="329" src="https://farm5.staticflickr.com/4655/26085816258_00158699eb.jpg" width="500" /></a><br />
<script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Download <a href="http://www.mapmyride.com/routes/view/1927668041">the GPX file</a>. This contains the Lat/Long values of the route<br />
<script src="https://gist.github.com/steveloughran/152f05c23d050cec2fcc682498b25a4f.js"></script><br />
If you try to upload it to Strava, it'll reject it, as there's no timing data. So add some, using timings from any real GPX trace as a reference point. It doesn't have to be valid time; just make it slow enough that Strava doesn't think you are driving and tell you off for cheating.<br />
<script src="https://gist.github.com/steveloughran/92c72fec78318049e1677381e8e111ec.js"></script><br />
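If you'd rather script that step than hand-edit the XML, here's a minimal sketch (the class, the 10-second interval and the file handling are all mine; it assumes a simple GPX file whose trackpoints are unprefixed trkpt elements):<br />
<pre>
// add synthetic timestamps to a GPX trace so Strava will accept it;
// one point every 10 seconds is slow enough not to look like driving
import java.io.File;
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GpxTimer {
  public static void main(String[] args) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().parse(new File(args[0]));
    NodeList points = doc.getElementsByTagName("trkpt");
    Instant start = Instant.parse("2018-01-28T10:00:00Z");
    for (int i = 0; i < points.getLength(); i++) {
      Element pt = (Element) points.item(i);
      Element time = doc.createElement("time");
      time.setTextContent(DateTimeFormatter.ISO_INSTANT
          .format(start.plusSeconds(10L * i)));
      pt.appendChild(time);
    }
    TransformerFactory.newInstance().newTransformer()
        .transform(new DOMSource(doc), new StreamResult(new File(args[1])));
  }
}
</pre>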
Upload the file as a run, creating <a href="https://www.strava.com/activities/1380808157">a new activity</a><br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/39927291442/in/dateposted/" title="Faked run uploaded"><img alt="Faked run uploaded" height="204" src="https://farm5.staticflickr.com/4659/39927291442_9c3da46890.jpg" width="500" /></a><br />
<script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
The next step is the devious one. "Create a segment", and turn part/all of the route into a Strava segment.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/25087944047/in/photostream/" title="Creating a segment from the trace"><img alt="Creating a segment from the trace" height="313" src="https://farm5.staticflickr.com/4632/25087944047_27a1c73518.jpg" width="500" /></a><br />
<script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Once strava has gone through its records, you'll be able to see the overall top 10 runners per gender/age group, when they ran, and who they ran with. And, if their profile isn't locked down enough: which other military bases they've gone for runs on.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/39958987881/in/photostream/" title="And now we wait to see who else did it"><img alt="And now we wait to see who else did it" height="218" src="https://farm5.staticflickr.com/4655/39958987881_9d631a50e5.jpg" width="500" /></a><br />
<script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
I have no idea who has done this run, or whether there'll be any segment matches at all. If not, maybe the trace isn't close enough to the real world traces, everyone runs round clockwise, or, hopefully, people are smart enough to mark the area as private. I'll leave it up on Strava overnight to see what it shows, then delete the segment and run.<br />
<br />
Finally, the Berlin Buzzwords CFP is open, and I'm still offering <a href="http://steveloughran.blogspot.com/2018/01/berlin-buzzwords-cfp-with-offer-of.html">to help with draft proposals</a>. We can now say it's the place where Strava-based infosec issues were covered two years before they were more widely known.<br />
<br />
<b>Update 2018-01-29T21:13</b>. I've removed the segment.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/25103142037/in/dateposted/" title="Removing Segment"><img alt="Removing Segment" height="315" src="https://farm5.staticflickr.com/4604/25103142037_a096d88ff2.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Some people's names were appearing there, showing that, yes, you can bootstrap from a heatmap to identification of individual people who have run the same route.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/39942095702/in/dateposted/" title="Segment top 17, as discovered"><img alt="Segment top 17, as discovered" height="500" src="https://farm5.staticflickr.com/4664/39942095702_6fb4c8a6c4.jpg" width="485" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
There's no need to blame the people here, so I've pulled the segment to preserve their anonymity. But as anyone else can do the same, they should still mark all govt. locations where they train as private areas, so keeping themselves out of the heatmap and Strava segments.<br />
<br />
I don't know what Strava will do long term, but to stop this recurring, they'll need a way to mark an area as "private area for all users". Doable. Then go to various governments and say "Give us a list of secret sites you don't want us to cover". Which, unless the governments include random areas like mountain ranges in mid Wales, is an interesting list of its own.<br />
<br />
<b>Update 2018-01-30T16:01</b> to clarify on marking routes private<br />
<br />
<ol>
<li>All rides/runs marked as "private" don't appear in the leader boards</li>
<li>All rides/runs marked as "don't show in leader boards" don't appear</li>
<li>Nor do any activities in a privacy zone make it onto a segment which starts/ends there</li>
<li><i>But: </i>"enhanced privacy mode" activities do. That is: even if you can't see an individual's activities from their profile, you can see the rides from the leaderboard.</li>
</ol>
<div>
<b>Update 2018-01-31T00:30 <a href="https://news.ycombinator.com/item?id=16263247">Hacker News coverage</a></b></div>
<div>
<br /></div>
<div>
I have made Hacker news. Achievement Unlocked!</div>
<div>
<br /></div>
<div>
Apparently</div>
<div>
<blockquote class="tr_bq">
<i>This is neither advanced nor denanonymization (sic).</i></blockquote>
<blockquote class="tr_bq">
<i>They basically pluck an interesting route from the hotmap (as per other people's recent discovery), pretend that they have also run/biked this route and Strava will show them names of others who run/biked the same way. That's clever, but that's not "advanced" by any means.</i></blockquote>
<blockquote class="tr_bq">
<i>It's also not a deanonymization as there's really no option in Strava for public _anonymous_ sharing to begin with.</i></blockquote>
</div>
<div>
<br /></div>
<div>
1. Thanks for pointing out the typo. Fixed.</div>
<div>
<br /></div>
<div>
2. It's bootstrapping from nominally anonymous heatmap data to identifying the participants of the route. And unless people use nicknames (only 2 of the 16 in the segment above did), you reveal your real name. And as it shows the entire route when you click through the timestamp, you get to see where they started/finished, who if anyone they went with, etc, etc. You may not know their real name, but you know a lot about them.</div>
<div>
<br /></div>
<div>
3. "It's not advanced". Actually, what Strava do behind the scenes is pretty advanced :). They determine which of all recorded routes match that segment, within 24h. Presumably they have a record of the bounds of every ride, so first select all rides whose bounds completely enclose the segment. Then they have to go through all of them to see if a fraction of their trail matches. I presume you'd go with the start co-ord and scan the trace to see if any of the waypoints <i>*or inferred bits of the line between two recorded waypoints*</i> is in the range of that start marker. If so, carry on along the trace looking for the next waypoint of the segment; giving up if the distance travelled is much greater than the expected distance. And they do that for all recorded events in past history. </div>
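<div>
<br /></div>
<div>
As a sketch, that matching might look something like the following (all names are mine, and this is certainly not Strava's actual code):</div>
<pre>
// walk the trace looking for each segment waypoint in order, giving up
// if we travel much further than the segment's own length
import java.util.List;

class SegmentMatcher {

  // crude flat-earth distance in km; good enough at segment scale
  static double distKm(double[] a, double[] b) {
    double dx = (b[1] - a[1]) * Math.cos(Math.toRadians(a[0])) * 111.3;
    double dy = (b[0] - a[0]) * 111.3;
    return Math.sqrt(dx * dx + dy * dy);
  }

  static boolean matches(List<double[]> trace, List<double[]> segment,
      double toleranceKm, double segmentKm) {
    int t = 0;
    double travelled = 0;
    for (double[] waypoint : segment) {
      // advance along the trace until we are within tolerance of the waypoint
      while (t < trace.size() && distKm(trace.get(t), waypoint) > toleranceKm) {
        if (t > 0) {
          travelled += distKm(trace.get(t - 1), trace.get(t));
        }
        if (travelled > 2 * segmentKm) {
          return false; // wandered way off course
        }
        t++;
      }
      if (t == trace.size()) {
        return false; // ran out of trace before matching every waypoint
      }
    }
    return true; // every waypoint matched, in order
  }
}
</pre>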
<div>
<br /></div>
<div>
All I did was play around with a web UI showing photographs from orbiting cameras, adjacent to a map of the world with humanity's activities inferred from published traces of how phones, watches and bike computers calculated their positions from a set of atomic clocks, uploaded over the internet to a queue in Apache Kafka, processed for storage in AWS S3, whose consistency and throttling are the bane of my life, and rendered via Apache Kafka, as covered in <a href="https://medium.com/strava-engineering/rebuilding-the-segment-leaderboards-infrastructure-part-3-design-of-the-new-system-39fdcf0d5eb4">Strava Engineering</a>. That is impressive work. Some of their analysis code is probably running through lines of code which I authored, and I'm glad to have contributed to something which is so useful, and, for the heatmap, beautiful to look at. </div>
<div>
<br /></div>
<div>
So no, I wasn't the one doing the advanced engineering —but I respect those who did, and I'm pleased to see the work of people I know being used in the app.</div>
Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-3912869276826546304.post-87572399725474492452018-01-10T11:16:00.002+00:002018-01-10T11:20:47.149+00:00Berlin Buzzwords: CFP with an offer of abstract reviewBerlin Buzzwords <a href="https://berlinbuzzwords.de/18/news/call-submissions-now-open">CFP is open</a>, which, along with <a href="https://dataworkssummit.com/berlin-2018/">Dataworks Summit</a> in April, is going to make Berlin the place for technical conferences in 2018.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/38715166195/in/dateposted/" title="Berlin"><img alt="Berlin" height="500" src="https://farm5.staticflickr.com/4668/38715166195_3c02485dc1.jpg" width="375" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
As with last year, I'm offering to review people's abstracts before they're submitted: to help edit them so the text is more in the style that reviewers tend to go for.<br />
<br />
When we review the talks, we look for interesting things in the themes of the conference, try and balance topics, pick the fun stuff. And we measure that (interesting, fun) on the prose of the submissions, knowing that they get turned into the program for the attendees: we want the text to be compelling for the audience.<br />
<br />
The target audiences for submissions then are twofold. The ultimate audience is the attendees. The reviewers? We're the filter in the way.<br />
<br />
But alongside that content, we want a diverse set of speakers, including people who have never spoken before. Otherwise it gets a bit repetitive (oh, no, stevel will talk on something random, again), and that's no good for the audience. But how do we regulars get in, given that the submission process is anonymous?<br />
<br />
We do it by writing abstracts which we know the reviewers are looking for.<br />
<br />
The review process, then, is a barrier to getting new speakers into the talk, which is dangerous: we all miss out on the insights from other people. And for the possible speakers, they miss out on the fun you have being a speaker at a conf, trying to get your slides together, discovering an hour in advance that you only have 20 min and not 30 for your talk and picking 1/3 of the slides to hide. Or on a trip to say, Boston, having your laptop have a hardware fault and you being grateful you snapshotted it onto a USB stick before you set off. Those are the bad points. The good bits? People coming up to you afterwards and getting into discussion about how they worked on similar stuff but came up with a better answer, how you learn back from the audience about related issues, how you can spend time in Berlin in cafes and wandering round, enjoying the city in early summer, sitting outside at restaurants with other developers from around Europe and the rest of the world, sharing pizza and beer in the evening. Berlin is a fun place for conferences.<br />
<br />
Which is why people should submit a talk, even if they've never presented before. And to help them, feel free to stick a draft up on google docs & then share with edit rights to my gmail address, steve.loughran@ ; send me a note and I'll look through.<br />
<br />
yes, I'm one of the reviewers, but in my reviews I call out that I helped with the submission: fairness is everything.<br />
<br />
Last year only one person, <a href="https://berlinbuzzwords.de/users/raam-rosh-hai">Raam Rosh Hai</a>, took this offer up, and he got in, with his talk <a href="https://berlinbuzzwords.de/17/session/how-build-recommendation-system-overnight">How to build a recommendation system overnight</a>! This means that, so far, all drafts which have been through this pre-review process have a 100% success rate. And, if you look at the video, you'll see it's a good talk: he deserved that place.<br />
<br />
<br />
Anyway, Submission deadline: Feb 14. Conference June 10-12. Happy to help with reviewing draft abstracts.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-75697813348763866492018-01-08T12:07:00.004+00:002018-01-08T12:16:09.077+00:00Trying to Meltdown in Java -failing. ProbablyMeltdown has made for an "interesting" week in computing, as everyone is learning about/revising their knowledge of Speculative Execution. FWIW, I'd recommend the latest version of Patterson and Hennessy, <a href="https://www.amazon.com/Computer-Architecture-Sixth-Quantitative-Approach/dp/0128119055/ref=dp_ob_title_bk">Computer Architecture: A Quantitative Approach</a>. Not just for its details on speculative execution, but because it is the best book on microprocessor architecture and design that anyone has ever written, and lovely to read. I could read it repeatedly and not get bored. (And I see I need to get the 6th edition!)<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/25663766638/in/dateposted/" title="Stokes Croft drugs find"><img alt="Stokes Croft drugs find" height="500" src="https://farm5.staticflickr.com/4728/25663766638_26159f84dc.jpg" width="375" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
This weekend, rather than read Patterson and Hennessy(*), I had a go at seeing if you could implement the meltdown attack in Java, and hence in mapreduce, spark, or any other non-native JAR.<br />
<br />
My <a href="https://github.com/steveloughran/speculate/blob/master/doc/java_and_meltdown.md">initial attempt failed</a>, provided the part only speculates one branch deep.<br />
<br />
More specifically: <i>"the range checking Java does on all array accesses blocks the standard exploit given steve's assumptions"</i>. You can speculatively execute the out of bounds query, but you can't read the second array at an offset which will trigger $L1 cache loading. <br />
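<br />
To make that concrete, here's the shape of the gadget I was trying to reproduce, in caricature (array names and sizes are mine, and this is the failed attempt, not a working exploit):<br />
<pre>
// The classic exploit gadget shape, in Java. The JIT's range check on
// carrier[index] is what appears to block it: the dependent probe[]
// load never seems to get speculated.
class MeltdownSketch {
  final byte[] carrier = new byte[16];
  final byte[] probe = new byte[256 * 4096];

  int attempt(int index) {
    // index is out of bounds: this throws; the question is whether the
    // CPU speculates the next load before the check resolves
    int v = carrier[index] & 0xff;
    // one distinct cache line per possible value of v
    return probe[v * 4096];
  }

  // afterwards, time a read of each probe line: an anomalously fast
  // read would mean that line was loaded during speculation, leaking v
  int hottestLine() {
    int guess = -1;
    long best = Long.MAX_VALUE;
    for (int i = 0; i < 256; i++) {
      long t0 = System.nanoTime();
      int ignored = probe[i * 4096];
      long dt = System.nanoTime() - t0;
      if (dt < best) {
        best = dt;
        guess = i;
      }
    }
    return guess;
  }
}
</pre>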
<br />
If there's a way to do a reference to two separate memory locations which doesn't trigger branching range checks, then you stand a chance of pulling it off. I tried that using the ? : operator pair, something like<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">String ref = data ? refA : refB;</span><br />
<br />
which I hoped might compile down to something like<br />
<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">mov ref, refB</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">cmp data, 0</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">cmovnz ref, refA</span><br />
<br />
This would do the move of the reference in the ongoing speculative branch, so that if "ref" was referenced in any way, it would trigger the resolution.<br />
<br />
In my experiment (2009 macbook pro with OSX Yosemite + the latest java 8 early access release), a branch was generated ... but there are some refs in the OpenJDK JIRA to using <span style="font-family: "courier new" , "courier" , monospace;">CMOV</span>, including the fact that the hotspot compiler may be generating it if it thinks the probability of the move taking place is high enough.<br />
<br />
Accordingly, I can't say <i>"the hotspot compiler doesn't generate exploitable codepaths"</i>, only "<i>in this experiment, the hotspot compiler didn't appear to generate an exploitable codepath".</i><br />
<br />
Now the code is done, I might try it on a Linux VM with Java 9 to see what is emitted.<br />
<ol>
<li>If you can get the exploit in, then you'd have access to other bits of the memory space of the same JVM, irrespective of what the OS does. That means one thread with a set of Kerberos tickets could perhaps grab the secrets of another. It'd be pretty hard, given the way the JVM manages objects on the heap: I wouldn't know where to begin, but it would become hypothetically possible.</li>
<li>If you can get native code which you don't trust loaded into the JVM, then it can do whatever it wants. The original meltdown exploit is there. But native code running in the JVM is going to have unrestricted access to the entire address space of the JVM -you don't need to use meltdown to grab secrets from the heap. All meltdown would do here is offer the possibility of grabbing kernel space data —which is what the OS patch defends against.</li>
</ol>
<br />
Anyway, I believe my first attempts failed <i>within the context of this experiment.</i><br />
<br />
Code-wise, this kept me busy on Sunday afternoon. I managed to twist my ankle quite badly on a broken paving stone on the way to the patisserie on Saturday, so sat around for an hour drinking coffee in Stokes Croft, then limped home, with all forms of exercise crossed off the TODO list for the w/e. Time for a bit of Java coding instead, as a break from what I'd been doing over the holiday (C coding <a href="https://github.com/steveloughran/pingish">a version of Ping</a> which outputs CSV data and <a href="https://github.com/steveloughran/zero-rename-committer">a LaTeX paper on the S3A committers</a>).<br />
<br />
It took as much time trying to get hold of the OS/X disassembler for generated code as it did coding the exploit. Why so? Oracle have replaced all links in java.sun.com which would point to the reference dynamic library with a 302 to the base Java page telling you how lucky you are that Java is embedded in cars. Or you see a ref to on-stack-replacement on a page in Project Kenai, under a URL which starts with <a href="https://kenai.com/">https://kenai.com/</a>, point your browser there and end up on <a href="http://www.oracle.com/splash/kenai.com/decommissioning/index.html">http://www.oracle.com/splash/kenai.com/decommissioning/index.html</a> and the message "We're sorry the kenai.com site has closed."<br />
<br />
All the history and knowledge on JVM internals and how to work there is gone. You can find the blog posts from four years ago on the topic, but the links to the tools are dead.<br />
<br />
This is truly awful. It's the best argument I've seen for publishing this info as PDF files with DOI references, where you can move the artifact around, but citeseer will always find it. If the information doesn't last five years, then it might as well never have been written.<br />
<br />
The irony is, it means that because Oracle have killed all those inbound links to Java tools, they're telling the kind of developer who wants to know these things to go away. That's strategically short-sighted. I can understand why you'd want to keep the cost of site maintenance down, but really, breaking every single link? It's a major loss to the Java platform —especially as I couldn't even find a replacement.<br />
<br />
I did manage to find a copy of the openjdk tarball people said you could D/L and run make on, but it was on a freebsd site, and even after a ./Configure && make, it broke trying to create a bsd dynlib. Then I checked out the full openjdk source tree, branch -8, installed the various tools and tried to build there. Again, some error. I ended up finding a copy of the needed hsdis-amd64.dylib library <a href="https://github.com/evolvedmicrobe/benchmarks/blob/master/hsdis-amd64.dylib">on Github</a>, but I had to then spend some time looking at <a href="https://github.com/evolvedmicrobe">evolvedmicrobe's w</a>ork &c to see if I could trust this to "probably" not be malware itself. I've replicated the JAR in <a href="https://github.com/steveloughran/speculate">the speculate module</a>, BTW.<br />
<br />
Anyway, once the disassembler was working and the other aspects of hotspot JIT compilation were clear (if you can't see the method you wrote, run the loop a few thousand more times), I got to see some well annotated x86-64 assembler. Leaving me with a new problem: x86-64 assembler. It's a lot cleaner than classic 32 bit x86: having more registers does that, especially as it gives lots of scope for improving how function parameters and return values are managed.<br />
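<br />
For anyone else trying this, the incantation I believe you need once the hsdis library is on the JVM's library path (the class and method names here are just examples):<br />
<pre>
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly \
  -XX:CompileCommand=print,MeltdownSketch.attempt \
  Harness
</pre>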
<br />
What next? This is only a spare time bit of work, and now I'm back from my EU-length xmas break, I'm doing other things. Maybe next weekend I'll do some more. At least now I know that exploiting meltdown from the JVM is not going to be straightforward.<br />
<br />
Also I found it quite interesting playing with this, to see when the JVM kicks out native code, and what it looks like. We code so far from the native hardware these days; it's too "low level". But the new speculation-side-channel attacks have shown that you'd better understand modern CPU architectures, including how your high-level code gets compiled down.<br />
<br />
I think I should submit a <a href="https://berlinbuzzwords.de/18/news/call-submissions-now-open">berlin buzzwords</a> talk on this topic. <br />
<br />
(*) It is traditional to swap the names of the author on every use. If you are a purist you have to remember the last order you used.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-91240890382585140592018-01-04T13:43:00.001+00:002018-01-04T13:43:24.804+00:00Speculation<br />
Speculative execution has been Intel's strategy for keeping the x86 architecture alive since the P6/Pentium Pro part shipped in '95.<br />
<br />
I remember coding explicitly for the P6 in a project in 1997; HPLabs was working with HP's IC Division to build their first CMOS-camera IC, which was an interesting problem. Suddenly your IC design needs to worry about light, aligning the optical colour filter with the sensors, making sure it all worked.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/39494089891/in/dateposted/" title="Eyeris"><img alt="Eyeris" height="500" src="https://farm5.staticflickr.com/4690/39494089891_fd982b3172.jpg" width="375" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
I ended up writing the code to capture the raw data at full frame rate, streaming to HDD, with an option to alternatively render it with/without the colour filtering (algorithms from another HPL team). Which means I get to nod knowingly when people complain about "raw" data. Yes, it's different for every device precisely because it's raw.<br />
<br />
The data rates of the VGA-resolution sensor via the PCI boards used to pull this off meant that both cores of a multiprocessor P6 box were needed. It was the first time I'd ever had a dual socket system, but both sockets were full with the 150MHz parts and with careful work we could get away with the "full link rate" data capture which was a core part of the qualification process. It's not enough to self test the chips any more, see: you need to look at the pictures.<br />
<br />
Without too many crises, everything came together, which is why I have a framed but slightly skewed IC part to hand. And it's why I have memories of writing multithreaded windows C++ code with some of the core logic in x86 assembler. I also have memories of ripping out that ASM code as it turned out that it was broken, doing it as C pointer code and having it be just as fast. That's because: C code compiled to x86 by a good compiler, executed on a great CPU, is at least as performant as hand-written x86 code by someone who isn't any good at assembler, and can be made to be correct more easily by the selfsame developer.<br />
<br />
150 MHz may be a number people laugh at today, but the CPU:RAM clock ratios weren't as bad as they are today: cache misses were less expensive in terms of pipeline stalls, and those parts were fast. Why? Speculative and out of order execution, amongst other things:<br />
<ol>
<li>The P6 could provisionally guess which way a branch was going to go, speculatively executing that path until it became clear whether or not the guess was correct -and then commit/abort that speculative code path.</li>
<li>It uses a branch predictor to make that guess on the direction a branch was taken, based on the history of previous attempts, and a default option (FWIW, this is why I tend to place the most likely outcome first in my if() statements; tradition and superstition).</li>
<li>It could execute operations out of order. That is, its predecessor, the P5, was the last time mainstream intel desktop/server parts executed x86 code in the order the compiler generated it, or the human wrote it.</li>
<li>Register renaming meant that even though the parts had a limited set of registers, those OOO operations could reuse the same EAX, EBX, ECX registers without problems.</li>
<li>It had caching to deal with the speed mismatch between that 150 MHz CPU & RAM.</li>
<li>It supported dual CPU desktops, and I believe quad-CPU servers too. They'd be called "dual core" and "quad core" these days and looked down at.</li>
</ol>
<br />
Being the first multicore system I'd ever used, it was a learning experience. First was learning how too much windows NT4 code was still not stable in such a world. NTFS crashes with all volumes corrupted? check. GDI rendering triggering kernel crash? check. And on a 4-core system I got hold of, everything crashed more often. Lesson: if you want a thread safe OS, give your kernel developers as many cores as you can.<br />
<br />
OOO forced me to learn about the x86 memory model itself: barrier opcodes, when things could get reordered and when they wouldn't. Summary: don't try and be clever about synchronization, as your assumptions are invalid.<br />
<br />
Speculation is always an unsatisfactory solution though. Every mis-speculation is lost cycles. And on a phone or laptop, that's wasted energy as much as time. And failed reads could fill up the cache with things you didn't want. I've tried to remember if I ever tried to use speculation to preload stuff if present, but I doubt it. The CMOV command was a non-branching conditional assignment which was better, even if you had to hand code it. The PIII/SSE added the <a href="https://c9x.me/x86/html/file_module_x86_id_252.html">PREFETCH</a> opcode so you could do a non-faulting hinted prefetch which you could stick into your non-branching code, but that was a niche opcode for people writing games/media codecs &c. And as Linus points out, <a href="https://www.realworldtech.com/forum/?threadid=132668&curpostid=132772">what was clever for one CPU model turns out to be a stupid idea a generation later</a>. (Arguably, that applies to Itanium/IA-64, though as it didn't speculate, it doesn't suffer from the Spectre & Meltdown attacks.)<br />
<br />
Speculation, then: a wonderful use of transistors to compensate for how we developers write so many if() statements in our code. Wonderful, because it kept the x86 line alive and so helped Intel deliver shareholder value and keep the RISC CPU out of the desktop, workstation and server businesses. Terrible because "transistors" is another word for "CPU die area" with its yield equations and opportunity cost, and also for "wasted energy on failed speculations". If we wrote code which had fewer branches in it, and that got compiled down to CMOV opcodes, life would be better. But we have so many layers of indirection these days; so many indirect references to resolve before those memory accesses. Things are probably getting worse now, not better. <br />
<br />
This week's speculation-side-channel attacks are fascinating then. These are very much architectural issues about speculation and branch prediction in general, rather than implementation details. Any CPU manufacturer whose parts do speculative execution has to be worried here, even if there's no evidence that your shipping parts are vulnerable to the current set of attacks. The whole point about speculation is to speed up operation based on the state of data held in registers or memory, so the time-to-execute is always going to be a side-channel providing information about the data used to make a branch.<br />
<br />
<br />
The fact that you can get at kernel memory, even from code running under a hypervisor, means, well, a lot. It means that VMs running in cloud infrastructure could get at the data of the host OS and/or those of other VMs running on the same host (those S3 SSE-C keys you passed up to your VM? 0wned, along with your current set of IAM role credentials). It potentially means that someone else's code could be playing games with branch prediction to determine what codepaths your code is taking. Which, in public cloud infrastructure is pretty serious, as the only way to stop people running their code alongside yours is currently to pay for the top of the line VMs and hope they get a dedicated part. I'm not even sure that dedicated cores in a multicore CPU are sufficient isolation, not for anything related to cache-side-channel attacks (they should be good for branch prediction, I think, if the attacker can't manipulate the branch predictor of the other cores).<br />
<br />
I can imagine the emails between cloud providers and CPU vendors being fairly strained, with the OEM/ODM teams on the CC: list. Even if the patches being rolled out mitigate things, if the slowdown on switching to kernelspace is as expensive as hinted, then that slows down applications, which means that the cost of running the same job in-cloud just got more expensive. Big cloud customers will be talking to their infrastructure suppliers on this, and then negotiating discounts for the extra CPU hours, which is a discount the cloud providers will expect to recover when they next buy servers. I feel as sorry for the cloud CPU account teams as I do for the x86 architecture group.<br />
<br />
Meanwhile, there's an interesting set of interview questions you could ask developers on this topic.<br />
<ol>
<li>What does the generated assembly for ival++ on a java long look like?</li>
<li>What if the long is marked as volatile?</li>
<li>What does the generated x86 assembler for a Java Optional<AtomicLong> opt.map(a -> a.addAndGet(1)) look like?</li>
<li>What guarantees do you get about reordering?</li>
<li>How would you write code which attempted to defend against speculation timing attacks?<br />
</li>
</ol>
<br />
I don't have the confidence to answer 1-4 myself, but I could at least go into detail about what I believed to be the case for 1-3; for #4 I should <a href="http://gee.cs.oswego.edu/dl/jmm/cookbook.html">do some revision</a>.<br />
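<br />
If you want to check your own answers to 1-3 empirically, the fragments are easy enough to push through the disassembler; a sketch (class name mine):<br />
<pre>
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// run each method a few thousand times under -XX:+PrintAssembly to see
// what hotspot actually emits
class Quiz {
  long ival;
  volatile long vval;

  void q1() { ival++; }   // plain long increment
  void q2() { vval++; }   // volatile: expect fenced/lock-prefixed ops

  long q3() {             // Optional + AtomicLong
    Optional<AtomicLong> opt = Optional.of(new AtomicLong());
    return opt.map(a -> a.addAndGet(1)).orElse(0L);
  }
}
</pre>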
<br />
As for #5, defending. I would love to see what others suggest. Conditional move (CMOV) ops could help against branch-prediction attacks, by eliminating the branches. However, searching for references to CMOV and the JDK turns up some issues which imply that <a href="https://bugs.openjdk.java.net/browse/JDK-8039104">branch prediction can sometimes be faster</a>, including "<a href="https://bugs.openjdk.java.net/browse/JDK-8039104">JDK-8039104. Don't use Math.min/max intrinsic on x86</a>". It may be that even CMOV gets speculated on, with the CPU prefetching what is moved and keeping the write uncommitted until the state of the condition is known.<br />
<br />
I suspect that the next edition of Hennessy and Patterson, "Computer Architecture, a Quantitative Approach" will be covering this topic. I shall look forward to it with even greater anticipation than I have had for all the previous, beloved, editions.<br />
<br />
As for all those people out there panicking about this, worrying if their nearly-new laptop is utterly exposed? You are running with Flash enabled on a laptop you use in cafe wifis without a VPN and with the same password, "k1tten", you use for gmail and paypal. You have other issues.<br />
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-47486924498702581832017-11-23T11:18:00.000+00:002017-11-23T11:18:08.966+00:00How to play with the new S3A committers<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/10490359744/in/album-72157636963150965/" title="Untitled"><img alt="Untitled" height="333" src="https://farm6.staticflickr.com/5516/10490359744_2110f51b66.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Following up from yesterday's <a href="http://steveloughran.blogspot.co.uk/2017/11/subatomic.html">post on the S3A committers</a>, here's what you need for picking up the committers.<br />
<ol>
<li><a href="https://github.com/apache/hadoop">Apache Hadoop trunk</a>; builds to 3.1.0-SNAPSHOT.</li>
<li>The <a href="https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md">documentation on use</a>.</li>
<li>An AWS keypair, try not to commit them to git. Tip for the Uber team: <a href="https://github.com/awslabs/git-secrets">git-secrets </a>is something you can add as a checkin hook. Do as I do: <a href="http://steveloughran.blogspot.co.uk/2016/04/testing-against-s3-and-object-stores.html">keep them elsewhere</a>.</li>
<li>If you want to use the magic committer, <a href="https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md">turn S3Guard on</a>. Initially though, I'd use the staging committer, specifically the "directory" one. </li>
<li>switch s3a:// to use that committer: fs.s3a.committer.name = directory (a config sketch follows this list)</li>
<li>Run your MR queries</li>
<li>look in _SUCCESS for committer info. 0 bytes long: classic FileOutputCommitter. A bit of JSON naming the committer, the files committed and some metrics (<a href="https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/files/SuccessData.java">SuccessData</a>): you are using an S3 committer.</li>
</ol>
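The switch in step 5 is one configuration option; a minimal sketch of setting it in the job configuration (or the core-site.xml equivalent), with the key name as in the committer docs:<br />
<pre>
import org.apache.hadoop.conf.Configuration;

class CommitterSetup {
  static Configuration directoryCommitter() {
    Configuration conf = new Configuration();
    // select the directory staging committer for s3a:// output
    conf.set("fs.s3a.committer.name", "directory");
    return conf;
  }
}
</pre>
<br />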
If you do that, I'd like to see the numbers comparing the FileOutputCommitter (which must have S3Guard) and the new committers. For benchmark consistency, leave S3Guard on.<br />
<br />
If you can't get things to work because the docs are wrong: file a JIRA with a patch. If the code is wrong: submit a patch with the fix & tests. <br />
<br />
Spark?<br />
<ol>
<li><a href="https://github.com/apache/spark">Spark Master</a> has a couple of patches to deal with integration issues (<a href="https://issues.apache.org/jira/browse/SPARK-21762">FNFE on magic output paths</a>, Parquet being <a href="https://issues.apache.org/jira/browse/SPARK-22217">over-fussy about committers</a>); I think the committer binding has enough workarounds for these to work with Spark 2.2 though.</li>
<li>Checkout my <a href="https://github.com/hortonworks-spark/cloud-integration/">cloud-integration for Apache Spark</a> repo, and its production-time redistributable, <a href="https://github.com/hortonworks-spark/cloud-integration/tree/master/spark-cloud-integration">spark-cloud-integration</a>.</li>
<li>Read <a href="https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/main/site/markdown/index.md">its docs</a> and use it</li>
<li>If you want to use Parquet over other formats, <a href="https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/main/scala/com/hortonworks/spark/cloud/commit/BindingParquetOutputCommitter.scala">use this committer</a>. </li>
<li>Again, check _SUCCESS to see what's going on. </li>
<li>There's <a href="https://github.com/hortonworks-spark/cloud-integration/tree/master/cloud-examples">a test module</a> with various (scaleable) tests as well as a copy and paste of some of the Spark SQL test.</li>
<li>Spark can work with the Partitioned committer. This is a staging committer which only worries about file conflicts in the final partitions. This lets you do in-situ updates of existing datasets, adding new partitions or overwriting existing ones, while leaving the rest alone. Hence: no need to move the output of a job into the reference datasets (see the sketch after this list). </li>
<li>Problems? <a href="https://github.com/hortonworks-spark/cloud-integration/issues">File an issue</a>. I've just seen Ewan has a couple of PRs I'd better look at, actually.</li>
</ol>
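For the Partitioned committer mentioned above, the conflict policy is the interesting bit; my reading of the docs, as a sketch:<br />
<pre>
import org.apache.hadoop.conf.Configuration;

class PartitionedSetup {
  static Configuration partitionedCommitter() {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.committer.name", "partitioned");
    // what to do when a destination partition already exists:
    // "fail", "append" or "replace"
    conf.set("fs.s3a.committer.staging.conflict-mode", "replace");
    return conf;
  }
}
</pre>
<br />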
Committer-wise, that spark-cloud-integration module is ultimately transient. I think we can identify those remaining issues with committer setup in spark core, after which a hadoop 3.0+ specific module should be able to work out the box with the new committers.<br />
<br />
There are still other things there, like:<br />
<ul>
<li>Cloud store optimised <a href="https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/main/scala/org/apache/spark/streaming/hortonworks/CloudInputDStream.scala">file input stream source</a>. </li>
<li><a href="https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/main/scala/org/apache/spark/hortonworks/ParallelizedWithLocalityRDD.scala">ParallelizedWithLocalityRDD</a>: an RDD which lets you provide custom functions to declare locality on a row-by-row basis. Used in my demo of implementing <a href="https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/main/scala/com/hortonworks/spark/cloud/applications/CloudCp.scala">DistCp in Spark</a>. Every row is a filename, which gets pushed out to a worker close to the data, which does the upload. This is very much a subset of distCp, but it shows what you can do with RDDs and cloud storage. </li>
<li>+ all the tests </li>
</ul>
I think maybe Apache Bahir would be the ultimate home for this. For now, a bit too unstable.<br />
<br />
(photo: spices on sale in a Mombasa market)Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-3912869276826546304.post-12087993523977696872017-11-22T19:01:00.002+00:002017-11-22T19:02:14.493+00:00subatomic<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/10490424604/in/album-72157636963150965/" title="Untitled"><img alt="Untitled" height="333" src="https://farm4.staticflickr.com/3713/10490424604_0a0da88bee.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
I've just committed <a href="https://issues.apache.org/jira/browse/HADOOP-13786">HADOOP-13786 Add S3A committer for zero-rename commits to S3 endpoints.</a>. Contributed by Steve Loughran and Ryan Blue.<br />
<br />
This is a serious and complex piece of work; I need to thank:<br />
<ol>
<li>Thomas Demoor and Ewan Higgs from WDC for their advice and testing. They understand the intricacies of the S3 protocol to the millimetre. </li>
<li>Ryan Blue for his <a href="https://github.com/rdblue/s3committer/">Staging-based S3 committer</a>. The core algorithms and code will be in hadoop-aws come Hadoop 3.1.</li>
<li>Colleagues for their support, including the illustrious Sanjay Radia, and Ram Venkatesh for letting me put so much time into this.</li>
<li>Reviewers, especially Ryan Blue, Ewan Higgs, Mingliang Liu and extra especially Aaron Fabbri @ cloudera. It's a big piece of code to learn. First time a patch of mine has ever crossed the 1MB source barrier</li>
</ol>
I now understand a lot about commit protocols in Hadoop and Spark, including the history of interesting failures encountered, events which are reflected in the change logs of the relevant classes. Things you never knew about the Hadoop MapReduce commit protocol <br />
<ol>
<li>The two different algorithms, v1 and v2 have very different semantics about the atomicity of task and job commits, including when output becomes visible in the destination directory.</li>
<li>Neither algorithm is atomic in both task and job commit.</li>
<li>V1 is atomic in task commits, but O(files) in its non-atomic job commit. It can recover from any job failure without having to rerun all succeeded tasks, but not from a failure in job commit.</li>
<li>V2's job commit is a repeatable atomic O(1) operation, because it is a no-op. Task commits do the move/merge, which is O(files); they make the output immediately visible, and, as a consequence, mean that failure of a job leaves the output directory in an unknown state (both algorithms are caricatured in the sketch after this list).</li>
<li>Both algorithms depend on the filesystem having consistent listings and Create/Update/Delete operations</li>
<li>The routine to merge the output of a task to the destination is a real-world example of a co-recursive algorithm. These are so rare most developers don't even know the term for them -or have forgotten it. </li>
<li>At-most-once execution is guaranteed by having the tasks and AM failing when they recognise that they are in trouble.</li>
<li>The App Master refuses to commit a job if it hasn't had a heartbeat with the YARN Resource Manager within a specific time period. This stops it committing work if the network is partitioned and the AM/RM protocol fails...YARN may have considered the job dead and restarted it.</li>
<li>tasks commit iff they get permission from the AM; thus they will not attempt to commit if the network partitions.</li>
<li>if a task given permission to commit does not report a successful commit to the AM, the V1 algorithm can rerun the task; v2 must conclude it's in an unknown state and abort the job. </li>
<li>Spark can commit using the Hadoop FileOutputCommitter; its Parquet support has some "special" code which refuses to work if the committer is not a subclass of ParquetOutputCommitter. That is: its special code makes it the hardest thing to bind to; ORC, CSV and Avro all work out the box.</li>
<li>It adds the ability for tasks to provide extra data to its job driver for use in job commit; this allows committers to explicitly pass commit information directly to the driver, rather than indirectly via the (consistent) filesystem. </li>
<li>Everyone's code assumes that abort() completes in a bounded time, and does not ever throw that IOException its signature promises it can.</li>
<li>There's lots of cruft in the MRv2 codebase to keep the MRv1 code alive, which would be really good to delete</li>
</ol>
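To caricature the two algorithms (my summary, with a hypothetical mergePathsRecursively() standing in for the real merge logic; this is not the actual MRv2 code):<br />
<pre>
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class CommitCaricature {

  // v1 task commit: one rename of the task attempt dir; atomic on HDFS
  void commitTaskV1(FileSystem fs, Path taskAttempt, Path jobAttempt)
      throws IOException {
    fs.rename(taskAttempt, new Path(jobAttempt, taskAttempt.getName()));
  }

  // v1 job commit: merge every committed task dir into the destination.
  // O(files) and not atomic, but nothing is visible until it starts.
  void commitJobV1(FileSystem fs, Path jobAttempt, Path dest)
      throws IOException {
    for (FileStatus st : fs.listStatus(jobAttempt)) {
      mergePathsRecursively(fs, st.getPath(), dest);
    }
  }

  // v2 task commit: merge straight into the destination, so output is
  // visible as soon as each task commits
  void commitTaskV2(FileSystem fs, Path taskAttempt, Path dest)
      throws IOException {
    mergePathsRecursively(fs, taskAttempt, dest);
  }

  // v2 job commit: a no-op, hence repeatable and O(1)
  void commitJobV2(FileSystem fs, Path jobAttempt, Path dest) {
  }

  // stand-in for the real (co-recursive) merge of files and directories
  void mergePathsRecursively(FileSystem fs, Path src, Path dest)
      throws IOException {
    // elided
  }
}
</pre>
<br />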
This means I get to argue the semantics of commit algorithms with people, as I know what the runtimes "really do", rather than what is believed by everyone who has neither implemented part of it nor stepped through the code in a debugger.<br />
<br />
If we had some TLA+ specifications of filesystems and object stores, we could perhaps write the algorithms as PlusCal examples, but that needs someone with the skills and the time. I'd have to find the time to learn TLA+ properly as well as specify everything, so it won't be me.<br />
<br />
Returning to the committers, what do they do which is so special?<br />
<br />
<i>They upload task output to the final destination paths in the tasks, but don't make the uploads visible until the job is committed</i>.<br />
<br />
No renames, no copies, no job-commit-time merges, <i>and no data visible until job commit</i>. Tasks which fail/fail to commit do not have any adverse side effects on the destination directories. <br />
<br />
First, read <a href="https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committer_architecture.md">S3A Committers: Architecture and Implementation</a>.<br />
<br />
Then, if that seems interesting <a href="https://github.com/apache/hadoop/tree/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit">look at the source</a>.<br />
<br />
A key feature is that we've snuck into FileOutputFormat a mechanism to allow you to provide different committers for different filesystem schemas.<br />
<br />
Normal file output formats (i.e. not Parquet) will automatically get the committer for the target filesystems, which, for S3A, can be changed from the default FileOutputCommitter to an S3A-specific one. And any other object store which also offers delayed materialization of uploaded data can implement their own and run it alongside the S3A ones, which will be something to keep the Azure, GCS and openstack teams busy, perhaps.<br />
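<br />
The binding itself is just configuration; a sketch, with the key name as in the committer documentation:<br />
<pre>
import org.apache.hadoop.conf.Configuration;

class CommitterBinding {
  // bind the s3a:// schema to the S3A committer factory; the factory
  // then picks the specific committer from fs.s3a.committer.name
  static void bindS3A(Configuration conf) {
    conf.set("mapreduce.outputcommitter.factory.scheme.s3a",
        "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory");
  }
}
</pre>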
<br />
For now though: users of Hadoop can use Amazon S3 (or compatible services) as the direct destination of Hadoop and Spark workloads without any overheads of copying the data, and the ability to support failure recovery and speculative execution. I'm happy with that as a good first step.<br />
<br />
(photo: street vendors at the Kenya/Tanzania Border) Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-9421376913481449772017-11-06T23:30:00.000+00:002017-11-06T23:38:39.664+00:00I do not fear Kerberos, but I do fear Apple Itunes billingI laugh at Kerberos messages. When I see a stack trace with a meaningless network error I go "that's interesting". I even learned PowerShell in a morning to fix where I'd managed to break our Windows build and tests.<br />
<br />
But there is now one piece of software I do not ever want to approach, ever again. Apple icloud billing.<br />
<br />
So far, since Saturday's warnings on my phone telling me that there was a billing problem<br />
<ol>
<li>Tried and repeatedly failed to update my card details</li>
<li>Had my VISA card seemingly blocked by my bank,</li>
<li>Been locked out of our Netflix subscription on account of them failing to bill a card which has been locked out by my bank</li>
<li>Had a chat session with someone on Apple online support, who finally told me to phone an 800 number.</li>
<li>Who are closed until office hours tomorrow</li>
</ol>
What am I trying to do? Set up iCloud family storage so I get a full resolution copy of my pics shared across devices, also give the other two members of our household lots of storage.<br />
<br />
What have I achieved? Apart from a card lockout and loss of Netflix, nothing.<br />
<br />
If this was a work problem I'd be loading debug-level log files of tens of GB in editors, using regexps to delete all lines of noise, then trying to work backwards from the first stack trace in one process to where something in another system went awry. Not here though; here I'm thinking "I don't need this". So if I don't get this sorted out by the end of the week, I won't be. I will have been defeated.<br />
<br />
Last month I opted to pay £7/month for 2TB of iCloud storage. This not only looked great value for 2TB of storage, the fact I could share it with the rest of the family meant that we got a very good deal for all that data. And, with integration with iphotos, I could use to upload all my full resolution pictures. So sign up I did<br />
<br />
My card is actually bonded to Bina's account, but here I set up the storage, and had to reenter it. The fact that the dropdown menu switched to Finnish was most amusing<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/26391538699/in/album-72157633427965316/" title="finnish"><img alt="finnish" height="32" src="https://farm5.staticflickr.com/4521/26391538699_d8732f84b9_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
With hindsight I should have taken "billing setup page cannot maintain consistency of locales between UI, known region of user, and menus" as a warning sign that something was broken.<br />
<br />
Other than that, everything seemed to work. Photo upload working well. I don't yet keep my full photoset managed by iPhotos; it's long been a partitionedBy(year, month) directory tree built up with the now unmaintained Picasa, backed up at full res to our home server, at lower res to google photos. The iCloud experience seemed to be going smoothly; smoothly enough to think about the logistics of a full photo import. One factor there: <a href="https://github.com/ndbroadbent/icloud_photos_downloader">iCloud photos downloader</a> works great as a way of downloading the full res images into the year/month layout, so I can pull images over to the server, giving me backup and exit strategies. <br />
<br />
That was on the Friday. On the Saturday a little alert pops up on the phone, matched by an email<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/38192123572/in/album-72157633427965316/" title="Apple "we will take away all your photos""><img alt="Apple "we will take away all your photos"" height="244" src="https://farm5.staticflickr.com/4476/38192123572_78972200ce_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Something has gone wrong. Well, no problem, over to billing. First, the phone UI. A couple of attempts and no, no joy. Over to the web page<br />
<br />
This time, the menus are in German<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/24308535618/in/album-72157633427965316/" title="appleID can't handle payment updates"><img alt="appleID can't handle payment updates" height="167" src="https://farm5.staticflickr.com/4456/24308535618_0274eb1fa7_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
"Something didn't work but we don't know what". Nice. Again? Same message.<br />
<br />
Never mind, I recognise "PayPal" in German; let's try that:<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/38129112452/in/album-72157633427965316/" title="And they can't handle paypal"><img alt="And they can't handle paypal" height="163" src="https://farm5.staticflickr.com/4493/38129112452_345475a0d2_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
No: failure. <br />
<br />
Next attempt: use my Visa credit card, not the bank debit card I normally use. This *appears* to take. At least, I haven't got any more emails, and the photos haven't been deleted. All well to the limits of my observability.<br />
<br />
Except, guess what ends up in my inbox instead? Netflix complaining about billing<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/38193547892/in/dateposted/" title="Netflix "there was a problem""><img alt="Netflix "there was a problem"" height="264" src="https://farm5.staticflickr.com/4540/38193547892_3f565eeeb4_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Hypothesis: repeated failures of apple billing to set things up have caused the bank to lock down the card, it just so happens that Netflix bill the same day (does everyone do the first few days of each month?), and so: blocked off. That is, Apple Billing's issues are sufficient to break Netflix.<br />
<br />
Over to the bank, review transactions, drop them a note. <br />
<br />
My bank is fairly secure and uses 2FA with a chip-and-pin card inserted into a portable card reader. You can log in without it, but then cannot set up transfers to any new destination. I normally use the card reader and card. Not today though, signatures aren't being accepted. Solution, fall back to the "secrets" and then compose a message<br />
<br />
Except of course, the first time I try that, it fails<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/37451972454" title="And I can't talk to my bank about it"><img alt="And I can't talk to my bank about it" height="96" src="https://farm5.staticflickr.com/4556/37451972454_18108b7b07_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
This is not a good day. Why can't I just have "Unknown failure at GSS API level"? That I can handle. Instead, what I am seeing here is a cross-service outage choreographed by Apple which, if it really does take away my photos, will even reach into my devices.<br />
<br />
Solution: log out, log in. Compose the message in a text editor for ease of resubmission. Paste and submit. Off it goes.<br />
<br />
Sunday: don't go near a computer. The phone has still got a red "billing issues" marker, though I can't distinguish "old billing issues" from new billing issues. That is: no email to say things are fixed. At the same time, no emails to say "things are still broken". Same from netflix: neither a success message nor a failure one. Nothing from the bank either.<br />
<br />
Monday: not worrying about this while working. No Kerberos errors there either. Today is a good day, apart from the thermostat on the ground floor not sending "turn the heating on" messages to the boiler, even after swapping the batteries.<br />
<br />
After dinner, netflix. Except the TV has been logged out. Log in to netflix on the web and yes, my card is still not valid. Go to the bank, no response there yet. Go back to netflix, insert Visa credit card: it's happy. This is good, as if this card started failing too, I'd be running out of functional payment mechanisms. <br />
<br />
Now, what about apple?<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/26446633409/in/album-72157633427965316/" title="apple id payment method; chrome"><img alt="apple id payment method; chrome" height="128" src="https://farm5.staticflickr.com/4452/26446633409_6109eed7f6.jpg" width="354" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
No, not English, or indeed, any language I know how to read. What now?<br />
<br />
Apple support, in the form of a chat<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/38223256411/in/album-72157633427965316/" title="Screen Shot 2017-11-06 at 21.10.51"><img alt="Screen Shot 2017-11-06 at 21.10.51" height="640" src="https://farm5.staticflickr.com/4546/38223256411_fa363f2a32_z.jpg" width="333" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
After a couple of minutes' wait I was talking to someone. I was a bit worried that the person I was talking to was "allen". I know Allen. Sometimes he's helpful. Let's see. <br />
<br />
After explaining my problem and sharing my appleId, Allen had a solution immediately: only the nominated owner of the family account can do the payment, even if the icloud storage account is in the name of another. So log in as them and try and sort stuff out there.<br />
<br />
So: log out as me, log in as B., edit the billing. Which is the same card I've been using. Somehow, things went so wrong with Apple billing trying to charge the card off my user ID, and failing, that I've been blocked everywhere. Solution: over to the VISA credit card. All "seems" well.<br />
<br />
But how can I be sure? I've not got any emails from Apple Billing. The little alert in the settings window is gone, but I don't trust it. Without notification from Apple confirming that all is well, I have to assume that things are utterly broken. How can I trust a billing system which has managed to lock me out of my banking or netflix? <br />
<br />
I raised this topic with Allen. After a bit of backwards and forwards, he gave me an 800 number to call. Which I did. They are closed after 19:00 hours, so I'll have to wait until tomorrow. I shall be calling them. I shall also be in touch with my bank.<br />
<br />
Overall: this has been, so far, an utter disaster. It's not just that the system suffers from broken details (prompts in random languages) and deeply broken back ends (whose card is charged), but it manages to escalate the problem to transitively block out other parts of my online life.<br />
<br />
If everything works tomorrow, I'll treat this as a transient disaster. If, on the other hand, things are not working tomorrow, I'm going to give up trying to maintain an iCloud storage account. I'll come up with some other solution. I just can't face having the billing system destroy the rest of my life.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-76863944592756232352017-10-23T18:09:00.001+01:002017-10-23T20:08:28.081+01:00ROCA breaks my commit processSince January I've been signing my git commits to the main Hadoop branches; along with Akira, Aaron and Allen we've been leading the way in trying to be more rigorous about authenticating our artifacts, in that gradual (and clearly futile) process to have some form of defensible INFOSEC policy on the development laptop (and ignoring homebrew, maven, sbt artifact installation,...).<br />
<br />
For extra rigorousness, I've been using a <a href="https://www.yubico.com/product/yubikey-4-series/">Yubikey 4</a> for the signing: I don't have the secret key on my laptop *at all*, just the revocation secrets. To sign work, I use "git commit -S"; the first commit of the day asks me to press the button and enter a PIN, and from then on all I have to do to sign a commit is just press the button on the dongle plugged into a USB port on the monitor. Simple, seamless signing.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/32504332601/in/album-72157667789976296/" title="Yubikey rollout"><img alt="Yubikey rollout" height="480" src="https://farm1.staticflickr.com/450/32504332601_e337fe9607_z.jpg" width="640" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Until Monday October 16, 2017.<br />
<br />
There was some news in the morning about a WPA2 vulnerability. I looked at the summary and opted not to worry; the patch status of consumer electronics on the WLAN is more significant a risk than the WiFi password. No problem there, more a moral "never trust a network". As for the hinted-at RSA vulnerability, it was inevitably going to take one of two forms: "utter disaster there's no point worrying about" or "hypothetical and irrelevant to most of us". Which is where I wasn't quite right.<br />
<br />
Later on in the afternoon, glancing at fb on the phone, and what should I see but a message from facebook.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/23884904848/in/album-72157667789976296/" title="Your OpenPGP public key is weak. Please revoke it and generate a replacement"><img alt="Your OpenPGP public key is weak. Please revoke it and generate a replacement" height="68" src="https://farm5.staticflickr.com/4461/23884904848_2c638156bc_o.png" width="420" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
"Your OpenPGP public key is weak. Please revoke it and generate a replacement"<br />
<br />
That's not the usual FB message. I go over to a laptop, log in to facebook and look at my settings: yes, I've pasted the public key in there. Not because I want encrypted comms with FB, but so that people can see it if they really want to; part of my "publish the key broadly" program, as I've been trying to cross-sign/cross-trust other ASF committers' keys.<br />
<br />
Then over to twitter and computing news sites and yes, there is <a href="https://arstechnica.com/information-technology/2017/10/crypto-failure-cripples-millions-of-high-security-keys-750k-estonian-ids/">a bug in an RSA keygen library used in SoC parts, from Estonian ID cards to Yubikeys like mine</a>. And as a result, it is possible for someone to take my public key and generate the private one. While the vulnerability is public, the exact algorithm to regenerate the private key isn't, so I have a bit of time left to kill my key. Which I do, and place an order for a replacement key (which has arrived).<br />
<br />
And here's the problem. Git treats the revocation of a key as a sign that every single signature must now be untrusted.<br />
<br />
Before: a one-commit-per-line log of branch-2 with --show-signature<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/37042315494/in/dateposted/" title="git log --1show-signatures branch-2"><img alt="git log --1show-signatures branch-2" height="436" src="https://farm5.staticflickr.com/4492/37042315494_f0f2ba8d6b_z.jpg" width="640" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
After<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/37493124630/in/dateposted/" title="git log1 --show-signature after revocation picked up"><img alt="git log1 --show-signature after revocation picked up" height="436" src="https://farm5.staticflickr.com/4498/37493124630_dfb42190f5_z.jpg" width="640" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
You see the difference? All my commits are now considered suspect. Anyone doing a log --show-signature will now actually get more warnings about the commits I signed than about all those commits by other people which are not signed at all. Even worse, anyone who ever tries to do a full validation of the commit chain at any time in the future is going to see this. For the entire history of the git repo, those commits of mine are going to show up untrusted. <br />
<br />
Given the way that git overreacts to key revocation, I didn't do this right.<br />
<br />
What I should have done is simpler: force-expired the key by changing its expiry date to the current date/time and pushing the updated public key up to the servers. As people update their keyrings from the servers, they'll see that the key isn't valid for signing new data, but that all commits issued by the user are marked as valid-at-the-time-but-with-an-expired-key. Key revocation would be reserved for the real emergency, "someone has my key and is actively using it".<br />
<br />
I now have a new key, and will be rolling it out. This time I'm thinking of rolling my signing key every year, so that if I ever do have to revoke a key, it's only the last year's worth of commits which will be invalidated. Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-46280208178645686122017-09-09T20:46:00.001+01:002017-09-09T20:46:38.154+01:00Stocator: A High Performance Object Store Connector for Spark<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/36789495506/in/dateposted/" title="Behind Picton Street"><img alt="Behind Picton Street" height="375" src="https://farm5.staticflickr.com/4379/36789495506_aa71e5b3a7.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
IBM have published <a href="https://arxiv.org/abs/1709.01812">a lovely paper on their Stocator 0-rename committer for Spark</a><br />
<br />
Stocator is: <br />
<ol>
<li>An extended Swift client</li>
<li>magic in their FS client to redirect mkdir and file PUT/HEAD/GET calls under the normal MRv1 __temporary paths to new paths in the dest dir (see the sketch after this list)</li>
<li>generating dest/part-0000 filenames using the attempt & task attempt ID to guarantee uniqueness and to ease cleanup: restarted jobs can delete the old attempts</li>
<li>Commit performance comes from eliminating the COPY, which is O(data),</li>
<li>And from tuning back the number of HTTP requests (probes for directories, mkdir 0 byte entries, deleting them)</li>
<li>Failure recovery comes from explicit names of output files. (note, avoiding any saving of shuffle files, which this wouldn't work with...spark can do that in memory)</li>
<li>They add summary data in the _SUCCESS file to list the files written & so work out what happened (though they don't actually use this data, instead relying on their swift service offering list consistency). (I've been doing something similar, primarily for testing & collection of statistics).</li>
</ol>
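<br />
A rough Java sketch of that path-remapping idea, purely my own illustration of the mechanism described above, not Stocator's actual code; the names and the output naming scheme are made up:<br />
<pre>
// Rewrite MRv1-style temporary task-attempt paths so the data is written
// directly to a unique name under the final destination directory.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemporaryPathRemapper {
  // Matches dest/_temporary/[jobAttempt]/_temporary/[taskAttempt]/file
  private static final Pattern TEMP_PATH = Pattern.compile(
      "(.*)/_temporary/\\d+/_temporary/(attempt_[^/]+)/(.+)");

  // Map a temporary task-attempt path to a final, unique destination path.
  public static String remap(String path) {
    Matcher m = TEMP_PATH.matcher(path);
    if (!m.matches()) {
      return path;                 // not a temporary path: leave alone
    }
    String dest = m.group(1);      // the final destination directory
    String attempt = m.group(2);   // task attempt ID: guarantees uniqueness
    String file = m.group(3);      // e.g. part-0000
    return dest + "/" + file + "-" + attempt;
  }

  public static void main(String[] args) {
    System.out.println(remap("s3a://bucket/results/_temporary/0/"
        + "_temporary/attempt_201709_0001_m_000000_0/part-0000"));
    // s3a://bucket/results/part-0000-attempt_201709_0001_m_000000_0
  }
}
</pre>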
<br />
Page 10 has their benchmarks, all of which are against an IBM storage system, not real amazon S3 with its different latencies and performance.<br />
<br />
<i>Table 5: Average run time (seconds)</i><br />
<br />
<br />
<table><tbody>
<tr><th></th><th>Read-Only 50GB</th><th>Read-Only 500GB</th><th>Teragen</th><th>Copy</th><th>Wordcount</th><th>Terasort</th><th>TPC-DS</th></tr>
<tr><td>Hadoop-Swift Base</td><td>37.80±0.48</td><td>393.10±0.92</td><td>624.60±4.00</td><td>622.10±13.52</td><td>244.10±17.72</td><td>681.90±6.10</td><td>101.50±1.50</td></tr>
<tr><td>S3a Base</td><td>33.30±0.42</td><td>254.80±4.00</td><td>699.50±8.40</td><td>705.10±8.50</td><td>193.50±1.80</td><td>746.00±7.20</td><td>104.50±2.20</td></tr>
<tr><td>Stocator</td><td>34.60±0.56</td><td>254.10±5.12</td><td>38.80±1.40</td><td>68.20±0.80</td><td>106.60±1.40</td><td>84.20±2.04</td><td>111.40±1.68</td></tr>
<tr><td>Hadoop-Swift Cv2</td><td>37.10±0.54</td><td>395.00±0.80</td><td>171.30±6.36</td><td>175.20±6.40</td><td>166.90±2.06</td><td>222.70±7.30</td><td>102.30±1.16</td></tr>
<tr><td>S3a Cv2</td><td>35.30±0.70</td><td>255.10±5.52</td><td>169.70±4.64</td><td>185.40±7.00</td><td>111.90±2.08</td><td>221.90±6.66</td><td>104.00±2.20</td></tr>
<tr><td>S3a Cv2 + FU</td><td>35.20±0.48</td><td>254.20±5.04</td><td>56.80±1.04</td><td>86.50±1.00</td><td>112.00±2.40</td><td>105.20±3.28</td><td>103.10±2.14</td></tr>
</tbody></table>
<br />
The S3a is the 2.7.x version, which has stabilised enough to be usable with Thomas Demoor's fast output stream (<a href="https://issues.apache.org/jira/browse/HADOOP-11183">HADOOP-11183</a>). That stream buffers in RAM & initiates the multipart upload once the block size threshold is reached. Provided you can upload data fast enough not to run out of RAM, it avoids the long waits at the end of close() calls, so has significant speedup. (The fast output stream has evolved into the S3ABlockOutputStream (<a href="https://issues.apache.org/jira/browse/HADOOP-13560">HADOOP-13560</a>) which can buffer off heap and to HDD, and which will become the sole output stream once the great cruft cull of <a href="https://issues.apache.org/jira/browse/HADOOP-14738">HADOOP-14738</a> goes in)<br />
<br />
That means in the doc, "FU" == fast upload == incremental upload & RAM storage. The default for S3A will become HDD storage, as unless you have a very fast pipe to a compatible S3 store, it's easy to overload the memory<br />
<br />
Cv2 means the MRv2 commit algorithm, the one which does a single rename operation on task commit (here the COPY), rather than one rename in task commit to promote that attempt and then another in job commit to finalise the entire job. So: only one COPY of every byte PUT, rather than two, and the COPY calls can run in parallel, often off the critical path<br />
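<br />
For reference, this is roughly how the "Cv2 + FU" combination above is selected on a Hadoop 2.7/2.8-era setup. A sketch, not a recommendation: check the property names and defaults of whichever release you actually run.<br />
<pre>
// Select the MRv2 commit algorithm and the S3A fast upload path.
import org.apache.hadoop.conf.Configuration;

public class CommitterTuning {
  public static Configuration configure(Configuration conf) {
    // v2 commit: a single COPY per file at task commit, rather than a
    // task-commit rename plus a second rename at job commit.
    conf.setInt("mapreduce.fileoutputcommitter.algorithm.version", 2);

    // "FU": incremental multipart upload as blocks fill (Hadoop 2.7/2.8).
    conf.setBoolean("fs.s3a.fast.upload", true);

    // Hadoop 2.8+: buffer blocks to disk rather than RAM, so a slow pipe
    // to the store can't exhaust memory.
    conf.set("fs.s3a.fast.upload.buffer", "disk");
    return conf;
  }
}
</pre>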
<br />
<i> Table 6: Workload speedups when using Stocator</i><br />
<br />
<br />
<table><tbody>
<tr><th></th><th>Read-Only 50GB</th><th>Read-Only 500GB</th><th>Teragen</th><th>Copy</th><th>Wordcount</th><th>Terasort</th><th>TPC-DS</th></tr>
<tr><td>Hadoop-Swift Base</td><td>x1.09</td><td>x1.55</td><td>x16.09</td><td>x9.12</td><td>x2.29</td><td>x8.10</td><td>x0.91</td></tr>
<tr><td>S3a Base</td><td>x0.96</td><td>x1.00</td><td>x18.03</td><td>x10.33</td><td>x1.82</td><td>x8.86</td><td>x0.94</td></tr>
<tr><td>Stocator</td><td>x1</td><td>x1</td><td>x1</td><td>x1</td><td>x1</td><td>x1</td><td>x1</td></tr>
<tr><td>Hadoop-Swift Cv2</td><td>x1.07</td><td>x1.55</td><td>x4.41</td><td>x2.57</td><td>x1.57</td><td>x2.64</td><td>x0.92</td></tr>
<tr><td>S3a Cv2</td><td>x1.02</td><td>x1.00</td><td>x4.37</td><td>x2.72</td><td>x1.05</td><td>x2.64</td><td>x0.93</td></tr>
<tr><td>S3a Cv2 + FU</td><td>x1.02</td><td>x1.00</td><td>x1.46</td><td>x1.27</td><td>x1.05</td><td>x1.25</td><td>x0.93</td></tr>
</tbody></table>
<br />
<br />
Their TPC-DS benchmarks show that stocator & swift is slower than Hadoop 2.7 S3a + fast upload & MRv2 commit. Which means that (a) the Hadoop swift connector is pretty underperforming and (b) with fadvise=random and columnar data (ORC, Parquet) that speedup alone will give better numbers than swift & stocator. (It also shows how much the TPC-DS benchmarks are IO-heavy rather than output-heavy, the way the tera-x benchmarks are.)<br />
<br />
As the co-author of <a href="https://issues.apache.org/jira/browse/HADOOP-8545">that original swift connector</a>, then, what the IBM paper is saying is "our zero-rename commit just about compensates for the functional but utterly underperformant code Steve wrote in 2013, and gives us equivalent numbers to the 2016 FS connectors by Steve and others, before they <a href="https://issues.apache.org/jira/browse/HADOOP-11694">started the serious work on S3A speedup</a>". Oh, and we used some of Steve's code to test it, <a href="https://github.com/SparkTC/stocator/issues/144">removing the ASF headers</a>. <br />
<br />
Note that as the IBM endpoint is neither the classic python Openstack swift nor Amazon's real S3, it won't exhibit the real issues those two have. Swift has the worst update inconsistency I've ever seen (i.e. repeatable whenever I overwrote a large file with a smaller one), and aggressive throttling even of the DELETE calls in test teardown. AWS S3 has its own issues, not just in list inconsistency, but serious latency of HEAD/GET requests, as they always go through the S3 load balancers. That is, I would hope that IBM's storage offers significantly better numbers than <a href="https://steveloughran.blogspot.co.uk/2016/12/how-long-does-filesystemexists-take.html">you get over long-haul S3 connections</a>. Although it'd be hard (impossible) to do a consistent test there, I'd fear in-EC2 performance numbers to be actually worse than those measured.<br />
<br />
I might post something faulting the paper, but maybe I should do a benchmark of my new committer first. For now though, my critique of both the swift:// and s3a:// clients is as follows:<br />
<br />
<i>Unless the storage services guarantees consistency of listing along with other operations, you can't use any of the MR commit algorithms to reliably commit work. So performance is moot. Here IBM do have a consistent store, so you can start to look at performance rather than just functionality. And as they note, committers which work with object store semantics are the way to do this: for operations like this you need the atomic operations of the store, not mocked operations in the client.</i><br />
<br />
People who complain about the performance of using swift or s3a as a destination are blissfully unaware of the key issue: the risk of data loss due to inconsistencies. Stocator solves both issues at once.<br />
<br />
Anyway, it means we should be planning a paper or two on our work too, maybe even starting by doing something about random IO and object storage, as in "what can you do for and in columnar storage formats to make them work better in a world where a seek() + read is potentially a new HTTP request?"<br />
<br />
(picture: parakeet behind Picton Street)<br />
<br />
<br />
<br />
<br />
<br />
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-17089104068218642062017-05-22T23:51:00.001+01:002017-05-22T23:52:21.509+01:00Dissent is a right: Dissent is a duty. @DissidentbotIt looks like the Russians interfered with the US elections, not just from the alleged publishing of the stolen emails, or through the alleged close links with the Trump campaign, but in the social networks, creating astroturfed campaigns and repeating the messages the country deemed important.<br />
<br />
Now the UK is having an election. And no doubt the bots will be out. But if the Russians can do bots: so can I.<br />
<br />
This then, is <a href="https://twitter.com/dissidentbot">@dissidentbot</a>.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/34791041426/in/photostream/" title="Dissidentbot"><img alt="Dissidentbot" height="375" src="https://c1.staticflickr.com/5/4245/34791041426_24ab01d338.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Dissidentbot is a Raspberry Pi running <a href="https://github.com/steveloughran/dissident/blob/master/dissident.rb">a 350-line ruby script</a> tasked with heckling politicians<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/34698678581/in/dateposted/" title="unrelated comments seem to work, if timely"><img alt="unrelated comments seem to work, if timely" height="475" src="https://c1.staticflickr.com/5/4194/34698678581_bf3cdc9630_o.png" width="587" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
It offers:<br />
<ul>
<li>The ability to listen to tweets from a number of sources: currently a few UK politicians</li>
<li>To respond by picking a random response from a set of replies written explicitly for each one</li>
<li>To tweet the reply after a 20-60s sleep.</li>
<li>Admin CLI over Twitter Direct Messaging</li>
<li>Live update of response sets via github.</li>
<li>Live add/remove of new targets (just follow/unfollow from the twitter UI)</li>
<li>Ability to assign a probability of replying, 0-100</li>
<li>Random response to anyone tweeting about it when that is not a reply (disabled due to issues)</li>
<li>Good PUE numbers, being powered off the USB port of the wifi base station, SSD storage and fanless naturally cooled DC. Oh, and we're generating a lot of solar right now, so zero-CO2 for half the day.</li>
</ul>
It's the first Ruby script of more than ten lines I've ever written; interesting experience, and I've now got three chapters into a copy of the Pickaxe Book I've had sitting unloved alongside "ML for the working programmer". It's nice to be able to develop just by saving the file & reloading it in the interpreter...not done that since I was Prolog programming. Refreshing.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/34020814323/in/dateposted/" title="Strong and Stable my arse"><img alt="Strong and Stable my arse" height="143" src="https://c1.staticflickr.com/5/4167/34020814323_c30ae7e59f_o.png" width="415" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Without type checking it's easy to ship code that's broken. I know, that's what tests are meant to find, but as this all depends on the live twitter APIs, testing it would take effort, including maybe some split between Model and Control. Instead: I've broken the code into little methods I can run in the CLI.<br />
<br />
As usual, the real problems surface once you go live:<br />
<ol>
<li>The bot kept failing overnight; nothing in the logs. Cause: its powered by the router and DD-WRT was set to reboot every night. Fix: disable.</li>
<li>It's "reply to any reference which isn't a reply itself" doesn't work right. I think it's partly RT related, but not fully tracked it down.</li>
<li>Although it can do a live update of the dissident.rb script, it's not yet restarting: I need to ssh in for that.</li>
<li>I've been testing it by tweeting things myself, so I've been having to tweet random things during testing.</li>
<li>Had to add handling of twitter blocking from too many API calls. Again: sleep a bit before retries (see the sketch after this list).</li>
<li>It's been blocked by the conservative party. That was because they've been tweeting 2-4 times/hour, and dissidentbot originally didn't have any jitter/sleep. After 24h of replying within 5s of their tweets, it was blocked.</li>
</ol>
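<br />
The jitter/sleep logic is simple enough; here's the idea as a Java sketch (the bot itself is Ruby, and these names are my own):<br />
<pre>
import java.util.concurrent.ThreadLocalRandom;

public class ReplyJitter {
  // The random 20-60s delay before replying, so responses don't land a
  // suspiciously constant few seconds after every tweet.
  static void sleepWithJitter(long minMillis, long maxMillis)
      throws InterruptedException {
    Thread.sleep(ThreadLocalRandom.current().nextLong(minMillis, maxMillis));
  }

  // Retry a rate-limited call, sleeping a bit longer before each attempt.
  static void retryWithBackoff(Runnable call, int attempts)
      throws InterruptedException {
    long pause = 5_000;                       // start with a 5s pause
    for (int remaining = attempts; remaining > 0; remaining--) {
      try {
        call.run();
        return;
      } catch (RuntimeException rateLimited) {
        Thread.sleep(pause);
        pause *= 2;                           // double the pause each time
      }
    }
    throw new RuntimeException("gave up after " + attempts + " attempts");
  }
}
</pre>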
The loopback code is the most annoying bug; nothing too serious though.<br />
<br />
The DM CLI is nice; the fact that I haven't got live restart is something which interferes with the workflow.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33988259474/in/dateposted/" title="Dissidentbot CLI via Twitter DM"><img alt="Dissidentbot CLI via Twitter DM" height="620" src="https://c1.staticflickr.com/5/4176/33988259474_60a8f85110_o.png" width="329" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Because the Pi is behind the firewall, I've no off-prem SSH access.<br />
<br />
The fact the conservatives have blocked me, that's just amusing. I'll need another account.<br />
<br />
One of the most amusing things is people argue with the bot. Even with "bot" in the name, a profile saying "a raspberry pi", people argue.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/34831767585/in/dateposted/" title="Arguing with Bots and losing"><img alt="Arguing with Bots and losing" height="723" src="https://c1.staticflickr.com/5/4227/34831767585_dbe285cf40_o.png" width="494" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Overall the big barrier is content. It turns out that you don't need to do anything clever about string matching to select the right tweet: random heckles seem to blend in. That's probably a metric of political debate in social media: a 350-line ruby script tweeting random phrases from a limited set is indistinguishable from humans.<br />
<br />
I will accept Pull Requests of <a href="https://github.com/steveloughran/dissident/tree/master/data">new content</a>. Also: people are free to deploy their own copies. Without the self.txt file it won't reply to any random mentions; it will just listen to its followed accounts and reply to those with a matching file in the data dir.<br />
<br />
If the Russians can do it, so can we.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-55628646273596574072017-05-15T10:47:00.001+01:002017-05-15T10:47:37.021+01:00The NHS gets 0wnedFriday's news was full of breaking panic about <a href="http://www.bbc.co.uk/news/health-39899646">an "attack" on the NHS</a>, making it sound like someone had deliberately made an attempt to get in there and cause damage.<br />
<br />
It turns out that it wasn't an attack against the NHS itself, just a wide scale ransomware attack which combined click-through installation and intranet propagation by way of a vulnerability which the NSA had kept for internal use for some time.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/34498204122/in/dateposted/" title="Laptops, Lan ports and SICP"><img alt="Laptops, Lan ports and SICP" height="375" src="https://c1.staticflickr.com/5/4158/34498204122_344d674677.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
The NHS got decimated for a combination of issues:<br />
<ol>
<li>A massive intranet for SMB worms to run free.</li>
<li>Clearly, lots of servers/desktops running the SMB protocol.</li>
<li>One or more people reading an email with the original attack, bootstrapping the payload into the network.</li>
<li>A tangible portion of the machines within some parts of the network running unpatched versions of Windows, clearly caused in part by the failure of successive governments to fund a replacement program while not paying MSFT for long-term support.</li>
<li>Some of these systems form part of medical machines: MRI scanners, VO2 test systems, CAT scanners, whatever they use in the radiology dept, to name but some of the NHS machines I've been through in the past five years.<br />
</li>
</ol>
The overall combination then is: a large network/set of networks with unsecured, unpatched targets was vulnerable to a drive-by attack, the kind of attack which, unlike a nation state itself, you <i>may</i> stand a chance of actually defending against. <br />
<br />
What went wrong?<br />
<br />
Issue 1: The intranet. Topic for another post.<br />
<br />
Issue 2: SMB.<br />
<br />
In servers this can be justified, though it's a shame that SMB sucks as a protocol. Desktops? It's that eternal problem: these things get stuck in as "features", but sometimes come back to burn you. Every process listening on a TCP or UDP port is a potential attack point. A "netstat -a" will list the running vulnerabilities on your system, enumerating running services (COM+, saned?, mDNS, ...) which you should review and decide whether they could be halted. Not that you can turn mDNS off on a macbook...<br />
<br />
Issue 3: Email<br />
<br />
With many staff, email clickthrough is a function of scale and probability: someone will, eventually. Probability always wins. <br />
<br />
Issue 4: The unpatched XP boxes.<br />
<br />
This is why Jeremy Hunt is in hiding, but it's also why our last Home Secretary, tasked with defending the nation's critical infrastructure, might want to avoid answering questions. Not that she is answering questions right now.<br />
<br />
Finally, 5: The medical systems.<br />
<br />
This is a complication on the "patch everything" story because every update to a server needs to be requalified. Why? <a href="https://en.wikipedia.org/wiki/Therac-25">Therac-25</a>.<br />
<br />
What's critical here is that the NHS was 0wned, not by some malicious nation state or dedicated hacker group: it fell victim to drive-by ransomware targeted at home users, small businesses, and anyone else with a weak INFOSEC policy. This is the kind of thing that you do actually stand a chance of defending against, at least on the laptop, desktop and server.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/25783543395/in/album-72157667789976296/" title="Untitled"><img alt="Untitled" height="500" src="https://c1.staticflickr.com/2/1718/25783543395_c92732db09.jpg" width="375" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Defending against malicious nation state is probably near-impossible given physical access to the NHS network is trivial: phone up at 4am complaining of chest pains and you get a bed with a LAN port alongside it and told to stay there until there's a free slot in the radiology clinic.<br />
<br />
What about the fact that the NSA had an exploit for the SMB vulnerability and were keeping quiet on it until the Shadow Brokers stuck it up online? This is a complex issue & I don't know what the right answer is.<br />
<br />
Whenever critical security patches go out, people try and reverse engineer them to get an attack which will work against unpatched versions of: IE, Flash, Java, etc. The problems here were:<br />
<ul>
<li>the Shadow Broker upload included a functional exploit, </li>
<li>it was over the network to enable worms, </li>
<li>and it worked against widely deployed yet unsupported windows versions.</li>
</ul>
The cost of developing the exploit was reduced, and the target space vast, especially in a large organisation. Which, for a worm scanning and attacking vulnerable hosts, is a perfect breeding ground.<br />
<br />
If someone else had found the hole and it had been patched, there'd still have been exploits out against it -the published code just made it easier and reduced the interval between patch and live exploit.<br />
<br />
The fact that it ran against an old windows version is also something which would have existed -unless MSFT were notified of the issue while they were still supporting WinXP. The disincentive for the NSA to disclose that is that a widely exploitable network attack is probably the equivalent of a strategic armament, one step below anything that can cut through a VPN and the routers, so getting you inside a network in the first place.<br />
<br />
The issues we need to look at are<br />
<ol>
<li>How long is it defensible to hold on to an exploit like this?</li>
<li>How to keep the exploit code secure during that period, while still using it when considered appropriate?</li>
</ol>
Here the MSFT "tomahawk" metaphor could be pushed a bit further. The US govt may have tomahawk missiles with nuclear payloads, but the ones they use are the low-damage conventional ones. That's what got out this time.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/27322814081/in/album-72157623050929013/" title="WMD in the Smithsonia"><img alt="WMD in the Smithsonia" height="379" src="https://c1.staticflickr.com/8/7520/27322814081_09bbdeea11.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
One thing that MSFT have to consider is: can they really continue with the "No more WinXP support" policy? I know they don't want to do it; the policy of making customers who care pay for the ongoing support is a fine way to do it, it's just that it leaves multiple vulnerabilities. People at home, organisations without the money and who think "they won't be a target", and embedded systems everywhere -like a pub I visited last year whose cash registers were running Windows XP embedded; all those ATMs out there, etc, etc.<br />
<br />
<b>Windows XP systems are a de-facto part of the nation's critical infrastructure.</b><br />
<br />
Having the UK and US governments pay for patches for the NHS and <i>everyone else</i> could be a cost effective way of securing a portion of the national infrastructure, for the NHS and beyond.<br />
<br />
(Photos: me working on SICP during an unplanned five-day stay at the Bristol Royal Infirmary. There's a LAN port above the bed I kept staring at; Windows XP Retail packaging, Smithsonian aerospace museum, the Mall, Washington DC) Unknownnoreply@blogger.com0Bristol, UK51.458693304480526 -2.59694331750483851.456219804480526 -2.601985817504838 51.461166804480527 -2.5919008175048379tag:blogger.com,1999:blog-3912869276826546304.post-38227659340383266192017-05-05T10:23:00.000+01:002017-05-05T10:23:37.303+01:00Is it time to fork Guava? Or rush towards Java 9?<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/32825128806/in/dateposted/" title="Lost Crew WiP"><img alt="Lost Crew WiP" height="500" src="https://c1.staticflickr.com/4/3954/32825128806_b12658b228.jpg" width="375" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Guava problems have surfaced again.<br />
<br />
Hadoop 2.x has long-shipped Guava 14, though we have worked to ensure it runs against later versions, primarily by re-implementing our own classes of things pulled/moved across versions.<br />
<br />
<br />
Hadoop trunk has moved up to Guava 21.0, <a href="https://issues.apache.org/jira/browse/HADOOP-10101">HADOOP-10101</a>. This has gone and overloaded the Preconditions.checkState() method, such that if you compile against Guava 21, your code doesn't link against older versions of Guava. I am so happy about this I could drink some more coffee.<br />
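<br />
To make the linkage trap concrete, here's a sketch of my own (not code from Hadoop itself): compiled against Guava 21, the call below binds to one of the new primitive-specific overloads; run it against Guava 14, which only has the varargs form, and it fails at runtime.<br />
<pre>
import com.google.common.base.Preconditions;

public class LinkageTrap {
  public static void process(int count) {
    // javac against Guava 21 selects checkState(boolean, String, int).
    // Guava 14 only ships checkState(boolean, String, Object...), so
    // running against it throws NoSuchMethodError.
    Preconditions.checkState(count >= 0, "invalid count: %s", count);
  }
}
</pre>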
<br />
Classpaths are the gift that keeps on giving, and any bug report with the word "Guava" in it is inevitably going to be a mess. In contrast, Jackson is far more backwards compatible; the main problem there is getting every JAR in sync.<br />
<br />
What to do?<br />
<br />
<b>Shade Guava Everywhere</b><br />
This is going to be tricky to pull off. Andrew Wang has <a href="https://issues.apache.org/jira/browse/HADOOP-14284">taken on this task</a>. This is one of those low-level engineering projects which doesn't have press-release benefits but which has the long-term potential to reduce pain. I'm glad someone else is doing it & will keep an eye on it.<br />
<br />
<b>Rush to use Java 9</b><br />
I am so looking forward to this from an engineering perspective.<br />
<br />
<b>Pull Guava out</b><br />
We could do our own Preconditions, our own VisibleForTesting attribute. More troublesome are the various cache classes, which do some nice things...hence they get used. That's a lot of engineering.<br />
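<br />
For the easy half of this option, a hypothetical sketch of what home-grown replacements could look like; the cache classes are where the real engineering cost lies.<br />
<pre>
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Minimal stand-in for the Preconditions methods used everywhere.
final class OwnPreconditions {
  private OwnPreconditions() {}

  static void checkState(boolean expression, String message, Object... args) {
    if (!expression) {
      throw new IllegalStateException(String.format(message, args));
    }
  }

  static void checkArgument(boolean expression, String message, Object... args) {
    if (!expression) {
      throw new IllegalArgumentException(String.format(message, args));
    }
  }
}

// Guava's @VisibleForTesting is just a marker annotation: trivial to replicate.
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.TYPE, ElementType.METHOD,
         ElementType.FIELD, ElementType.CONSTRUCTOR})
@interface VisibleForTesting {}
</pre>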
<br />
<b>Fork Guava</b><br />
We'd have to keep up to date with all new Guava features, while reinstating the bits they took away. The goal: stuff built with old Guava versions still works.<br />
<br />
I'm starting to look at option four. Biggest issue: cost of maintenance.<br />
<br />
There's also the fact that once we use our own naming "org.apache.hadoop/hadoop-guava-fork" then maven and ivy won't detect conflicting versions, and we end up with > 1 version of the guava JARs on the CP, and we've just introduced a new failure mode.<br />
<br />
Java 9 is the one that has the best long-term potential, but at the same time, the time it's taken to move production clusters onto Java 8 makes it 18-24 months out at a minimum. Is that so bad though?<br />
<br />
I actually created the <a href="https://issues.apache.org/jira/browse/HADOOP-11123">"Move to Java 9"</a> JIRA in 2014. It's been lurking there, with Akira Ajisaka doing the equally unappreciated step-by-step movement towards it.<br />
<br />
Maybe I should just focus some spare-review-time onto Java 9; see what's going on, review those patches and get them in. That would set things up for early adopters to move to Java 9, which, for in-cloud deployments, is something where people can be more agile and experimental.<br />
<br />
(photo: someone painting down in Stokes Croft. Lost Crew tag)<br />
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-47369749931982212912017-04-12T12:26:00.001+01:002017-04-12T12:26:21.402+01:00Mocking: an enemy of maintenance<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33306020986/in/dateposted/" title="Bristol spring"><img alt="Bristol spring" height="375" src="https://c1.staticflickr.com/1/722/33306020986_e0892c392e.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
I'm keeping myself busy right now with <a href="https://issues.apache.org/jira/browse/HADOOP-13786" target="_blank">HADOOP-13786</a>, an O(1) committer for job output into S3 buckets. The classic filesystem relies on rename() for that, but against S3 rename is a file-by-file copy whose time is O(data) and whose failure mode is "a mess", amplified by the fact that an inconsistent FS can create the illusion that destination data hasn't yet been deleted: false conflict. <br />
This creates failures like <a href="https://issues.apache.org/jira/browse/SPARK-18512">SPARK-18512</a>, <i>FileNotFoundException on _temporary directory with Spark Streaming 2.0.1 and S3A</i>, as well as long commit delays.<br />
<br />
I started this work a while back, making changes into the S3A Filesystem to support it. I've stopped focusing on that committer, and instead pulled in <a href="https://github.com/rdblue/s3committer">the version which Netflix have been using</a>, which has the advantages of a thought out failure policy, and production testing. I've been busy merging that with the rest of the S3A work, and am now at the stage where I'm switching it over to the operations I've written for the first attempt, the "magic committer". These are in S3A, where they integrate with S3Guard state updates, instrumentation and metrics, retry logic, etc etc. All good.<br />
<br />
The actual code to do the switchover is straightforward. What is taking up all my time is fixing the mock tests. These are failing with false positives, "I've broken the code", when really the cause is "these mock tests are too brittle". In particular, I've had to rework how the tracking of operations goes, as a mock Amazon S3 client is no longer used by the committer; instead it's associated with the FS instance, which is then shared by all operations in a single test method. And the use of S3A FS methods shows up where it's failing due to the mock instance not initialising properly. I ended up spending most of Tuesday simply implementing the abort() call; now I'm doing the same on commit(). The production code switches fine, it's just the mock stuff.<br />
<br />
This has really put me off mocking. I have used it sporadically in the past, and I've occasionally had to work with other people's. Mocking has some nice features<br />
<ul>
<li>Can run in unit tests which don't need AWS credentials, so Yetus/Jenkins can run them on patches.</li>
<li>Can be used to simulate failures and validate outcomes.</li>
</ul>
But the disadvantage is I just think they are too high maintenance. One test I've already migrated to being an integration test against an object store; I retained the original mock one, but just deleted that yesterday as it was going to be too expensive to migrate and, with that IT test, obsolete.<br />
<br />
The others, well: the changes for abort() should help, but every new S3A method that gets called triggers new problems which I need to address. This is, well, "frustrating".<br />
<br />
It's really putting me off mocking. Ignoring the Jenkins aspect, the key benefit is structured fault injection. I believe I could implement that in the IT tests too, at least in those tests which run in the same JVM. If I wanted to, I could probably even do it in the forked VMs by propagating details on the desired failures to the processes. Or, if I really wanted to be devious, by running an HTTP proxy in the test VM and simulating network failures for the AWS client code itself to hit. That wouldn't catch all real-world problems (DNS, routing), but I could raise authentication failures, transient HTTP failures, and of course, force in listing inconsistencies. This is tempting, because it will help me qualify the AWS SDK we depend on, and could be re-used for testing the Azure storage too. Yes, it would take effort, but given the cost of maintaining those Mock tests after some minor refactoring of the production code, it's starting to look appealing.<br />
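<br />
As a sketch of what that structured fault injection could look like (my illustration with made-up names, not the committer's actual test code): the test setup programs a policy object, and an instrumented client wrapper consults it before each operation.<br />
<pre>
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class FaultInjectionPolicy {
  private volatile String failingOperation;
  private final AtomicInteger remainingFailures = new AtomicInteger(0);

  // Program the next count invocations of the named operation to fail.
  public void failNext(String operation, int count) {
    failingOperation = operation;
    remainingFailures.set(count);
  }

  // Called by the instrumented store client before each operation.
  public void maybeFail(String operation) throws IOException {
    if (operation.equals(failingOperation)
        && remainingFailures.getAndDecrement() > 0) {
      throw new IOException("injected failure in " + operation);
    }
  }
}
</pre>
A test would then do something like policy.failNext("PUT", 1), run the task commit, and assert that the retry logic recovered and the output is complete.<br />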
<br />
(photo: Garage door, Greenbank, Bristol)Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-20565197271034847662017-04-11T21:43:00.000+01:002017-04-20T19:19:05.218+01:00The interruption economyWith the untimely death of a laptop in Boston in February, I've rebuilt two laptops recently.<br />
<br />
The first: a replacement for the dead one: a development macbook pro wired up to the various bits of work infra: MS office, VPN, even hipchat. The second, a formerly dead 2009 macbook brought back to life with a 256GB SSD and a boost of its RAM to 8GB (!).<br />
<br />
Doing this has brought home to me a harsh truth<br />
<br />
The majority of applications you install on an OSX laptop consider it not just a right, but a duty, to interrupt you while you are trying to work.<br />
<br />
It's not just the things where someone actually wants to talk to you (e.g. skype); it's pretty much everything you can install<br />
<br />
For example, iTunes wants to be able to interrupt me, including playing sounds. It's a music player application, and it also wants to make beeping noises? Same for spotify. Why should background music apps or foreground media playback apps think they need to be able to interrupt you when they are running in the background?<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33971592425/in/dateposted/" title="iTunes wants to interrupt me"><img alt="iTunes wants to interrupt me" height="263" src="https://c1.staticflickr.com/3/2824/33971592425_b7b4804b01_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Dropbox. I didn't realise this was doing notifications until it suddenly popped up to tell me the good news that it was keeping itself up to date automatically.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33971593645/in/dateposted/" title="Dropbox interrupting me with a random fact"><img alt="Dropbox interrupting me with a random fact" height="320" src="https://c1.staticflickr.com/3/2943/33971593645_7623725f66_n.jpg" width="271" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Keeping your installation up to date is something we should expect all applications to do. It should not be so important that you pop up a dialog box saying "good news, you are only at risk from the 0-day exploits we haven't found or patched yet!". Once I was aware that dropbox was happy to interrupt me, I went to its settings, only to discover that it also wants to interrupt me on "comments, shares and @mentions", and on synced files. <br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33595767560/in/dateposted/" title="Dropbox wants to harass me"><img alt="Dropbox wants to harass me" height="288" src="https://c1.staticflickr.com/4/3939/33595767560_ed115e5388_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
I hadn't noticed that a tool I used to sync files across machines had evolved into a groupware app where people could @mention me, but clearly it has, and in teams, interruptions whenever someone comments on things is clearly considered good. It also wants to interrupt me on files syncing. Think about that. We have an application whose primary purpose is "synchronising files across machines", and suddenly it wants to start popping up notifications when it is doing its job? What else should we have? Note taking applications sharing the good news that they haven't crashed yet?<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33971601655/in/photostream/" title="Apple Notes wants to interrupt me"><img alt="Apple Notes wants to interrupt me" height="263" src="https://c1.staticflickr.com/3/2917/33971601655_442e5bca2d_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Maybe, because amongst the apps which also consider interruption an inalienable right are OneNote and the macOS notes app. I have no idea what they want to interrupt me about: Notes doesn't specify what it wants to alert me about, only that it wants to notify me on locked screens and make a noise. OneNote? Lets you specify which notebooks can trigger interrupts, but again, the why is missing.<br />
<br />
The list goes on. My password manager, text editor, IDE. Everything I install defaults to interrupting me.<br />
<br />
Yes, you can turn the features off, but on a newly installed machine, that means you have to go through every single app and disable every single interruption point. Miss out some small detail and, while you are trying to get some work done, something pops up to say "lucky you! Something has happened which Photos thinks is so important you should stop what you are doing and use it instead!". When you are building up two laptops, it means there are 20+ times I've had to bring up the notifications preference pane, scroll down to whichever app last interrupted me, turn off all its notifications, then continue until something else chooses to break my concentration.<br />
<br />
The web browsers want to let web pages interrupt you too. <br />
<br />
Firefox: you can't disable it, at least not without delving into about:config.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33971605215/in/photostream/" title="Firefox doesn't seem to let me utterly disable interrupts"><img alt="Firefox doesn't seem to let me utterly disable interrupts" height="162" src="https://c1.staticflickr.com/3/2933/33971605215_e0ee1ba063_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
You can block it in the OS notifications settings, which implies it is at least integrated with the OS and the system-wide do-not-disturb feature.<br />
<br />
<br />
Chrome: you can manage it in the browser (even though google don't want you to stop it), but it doesn't appear to be integrated with the OS;<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33971602895/in/photostream/" title="Google chrome recommends interruptiblity"><img alt="Google chrome recommends interruptiblity" height="100" src="https://c1.staticflickr.com/3/2876/33971602895_8a2e60331b_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
Without the OS integration, OSX's do-not-disturb feature won't work here, so if you do let Chrome notify you, webapps gain the right to interrupt you during presentations, watching media content, etc.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33813813112/in/photostream/" title="Safari lets you disable web site notifications, you just have to clear the check box"><img alt="Safari lets you disable web site notifications, you just have to clear the check box" height="257" src="https://c1.staticflickr.com/3/2896/33813813112_c4fd8925aa_n.jpg" width="320" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
Safari? Permitted, but OS-controlled, completely blockable. This doesn't mean that webapps shouldn't be able to interrupt you: google calendar is a good example. It's just that the easier we make it to do this, the more sites will want to.<br />
<br />
<br />
The OS isn't even consistent itself. There is no way to tell time machine to not annoy you with the fact that it hasn't updated for 11 days. It's not part of the notification system, even though it came from the same building. What kind of example is that to set for others?<br />
<br />
<br />
Because the default behaviour of every application is to interrupt, I have to go through every single installed app to disable it else my life is a constant noise of popups stating irrelevant facts. You may not notice that as you install one application at a time, turning off the settings individually, but when you build up a new box, the arrogance of all these applications becomes obvious, as it takes some time to actually stop your attention being attacked by the software you install.<br />
<br />
Getting users to look at your app, your web site, is roped in as "the attention economy". That certainly applies to things like twitter, facebook, snapchat, etc. But how does that translate into dropbox trying to get my attention to tell me that it's keeping itself up to date? Or whatever itunes or photos wants to interrupt me on? Why does OneNote need to tell me something about a saved workbook? This isn't "<i>the attention economy</i>". This is the "<i>interruption economy</i>": people terrified that users may not be making full use of their features, so trying to keep popping up to encourage you to use the app or whatever new feature they've just installed <br />
<br />
Interrupting people while they are trying to work is not a good use of the lives of people whose work depends on "getting things done without interruptions". As my colleagues should know, though some of them forget, I don't run with hipchat on precisely because I hate getting popups "hey Steve, can I just ask...", where the ask is something that I'd google for the answer to myself; why somebody asks me rather than googling for it themselves, I don't know. But even with the workflow interrupts off, things keep trying to stop me getting anything done<br />
<br />
Then there's the apps which interrupt without any warning at all. I got caught out by this at Dataworks Summit, where halfway through a presentation GPGMail popped up telling me there was a new version. This was a presentation where I'd explicitly set "do not disturb" on and was running full screen, but GPGMail's checks weren't using it. Lesson: turn off the wifi as well as setting everything to do-not-disturb/offline.<br />
<br />
Those update prompts are important. But when everything keeps going "update me! now!", they end up being an irritant to ignore, just like the way the "service now!" alert pops up in our car when we use it. It's just another low-level hint, not something which matters like "low pressure in tyres".<br />
<br />
What it does really highlight is that having an application keep itself up to date with security patches is still considered, on OSX, to be something worth interrupting the user to let them know about. All I can say is that it's a good thing Linux apps don't feel the same way, or apt-get upgrade would be unbearable.<br />
<br />
<br />
Finally, there's the OS:<br />
<ul>
<li>It'd be good if the OS recognised when a full screen media/presentation app was underway and automatically went into silent mode at that point.</li>
<li>All the OS's own notifications -"upgrade available", "no time machine backups"- should be integrated with the same notification mechanism offered to applications. That's to help the users, but also to set an example for all others.</li>
</ul>
<br />
What to really do about it?<br />
<br />
I'd really like to be able to tell the OS that the default setting for any newly installed app is "no notifications". Maybe now I've built up the laptops I won't have to go through the torment of disabling it across many apps, so it'll just be a case-by-case irritant. Even so, there's still the pain of being reminded of update options.<br />
<br />
What I can do though, is promise not to personally write applications which interrupt people by default.<br />
<br />
Here then, is my pledge:<br />
<ol>
<li>I pledge to give my users the opportunity to live a life free of interruptions, at least from my own code.</li>
<li>I pledge not to write applications which bring up notification boxes to tell you that they have kept themselves up to date automatically, that someone has logged in to another machine, or that someone else is viewing a document a user has co-authored.</li>
<li>Ideally, the update mechanism should integrate with that of the OS, so that the OS can handle the notifications (or not). </li>
<li>If I then add notifications in an application for what I consider to be relevant information, I pledge that the default state will be "don't".</li>
<li>They will all go away when left alone. </li>
<li>Furthermore, I pledge to use the OS supplied mechanism and integrate with any do- not-disturb mechanism the OS implements.</li>
</ol>
I know, I haven't done client side code for a long time, but I can assure people, if I did: I'd try to be much less annoying than what we have today. Because I recognise how much pain this causes.<br />
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-89621758789930035062017-03-02T22:41:00.002+00:002017-03-02T22:42:48.709+00:00The Great S3 Outage of February 2017On Tuesday the world split into three groups<br />
<ol><li>Those who knew that S3 was down, and the internet itself was in crisis.</li>
<li>Those who knew that some of the web sites and phone apps they used weren't working right, but didn't know why.</li>
<li>Those who didn't notice and wouldn't have cared.</li>
</ol><br />
I was obviously in group 1, the engineers, who whisper to each other, "where were you when S3 went down?".<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33086895521" title="S3 Outage: Increased Error Rate"><img alt="S3 Outage: Increased Error Rate" height="92" src="https://c1.staticflickr.com/3/2845/33086895521_fcbf323566.jpg" width="500" /></a><br />
<br />
<br />
I was running the latest hadoop-aws s3a tests, and noticed that some of my tests were failing. Not the ones against S3 Ireland, but those against the landsat bucket we use in lots of our hadoop tests, as it is a source of a 20 MB CSV file where nobody has to pay download fees, or spend time creating a 20 MB CSV file. Apparently there are lots of landsat images too, but our hadoop tests stop at seeking in the file. I've a spark test which <a href="https://github.com/steveloughran/spark-cloud-examples/blob/master/cloud-examples/src/main/scala/com/hortonworks/spark/cloud/s3/S3ALineCount.scala" target="_blank">does the whole CSV parse thing</a>, as well as one <a href="https://github.com/steveloughran/spark-cloud-examples/blob/master/cloud-examples/src/main/scala/com/hortonworks/spark/cloud/examples/S3DataFrameExample.scala" target="_blank">I use in demos</a> as an example not just of dataframes against cloud data, but of how data can be dirty, such as with a cloud cover of less than 0%.<br />
<br />
Partial test failures: never good.<br />
<br />
It was only when I noticed that other things were offline that I cheered up: unless somehow my delayed-commit multipart put requests had killed S3, I wasn't to blame. And with everything offline I could finish work at 18:30 and stick some lasagne in the oven. (I'm fending for myself &amp; keeping a teenager fed this week).<br />
<br />
What was impressive was seeing how deep it went into things. Strava app? toast. Various build tools and things? Offline.<br />
<br />
Which means that S3 wasn't just a SPOF for my own code, but for a lot of transitive dependencies, meaning that things just weren't working -all the way up the chain.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/33058043062/in/dateposted/" title="S3 Outage: We can update our status page"><img alt="S3 Outage: We can update our status page" height="161" src="https://c1.staticflickr.com/4/3911/33058043062_eb8f3750fb.jpg" width="500" /></a><br />
<br />
S3 is clearly so ubiquitous a store that the failure of US-East was enough to cause major failures, everywhere.<br />
<br />
Which makes designing to be resilient to an S3 outage so hard: you not only have to make your own system somehow resilient to failure, you have to know how your dependencies cope with such problems. For which step one is: identify those dependencies.<br />
<br />
Fortunately, we all got to find out on Tuesday.<br />
<br />
Trying to mitigate a full S3 outage is probably pretty hard. At the very least:<br />
<ol><li>Replicated front-end content across different S3 installations would allow you to present some kind of UI.</li>
<li>If you are collecting data for processing, then you need a contingency plan for the sink being offline: alternate destinations, local buffering, discarding (nifi can be given rules here); see the sketch below.</li>
<li>We need our own status pages which can be updated even if the entire infrastructure we depend on is missing. That is: host them somewhere else, and have multiple people with login rights, so an individual isn't the SPOF. Maybe even a facebook page too, as a final backup. </li>
<li>We can't trust the AWS status page any more.</li>
</ol>Is it worth putting lots of effort into eliminating an S3 outage as a SPOF? Well, the failure rate is such that it's a lot of effort for a very rare occurrence. If you are user facing, some app like strava, maybe it's easiest to say "no". If you are providing a service for others though, availability, or at least the ability to degrade QoS gracefully, is something to look at.<br />
<br />
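As a sketch of point 2 -alternate destinations when the sink is offline- here's roughly what a fallback write path could look like with the Hadoop FileSystem API. The bucket and cluster names are invented, and a real version would need queueing, retries and monitoring; treat it as an outline, not an implementation.<br />
<pre>// Sketch: write to the primary object store; if that fails, divert to a
// fallback filesystem for later replay. (All names here are hypothetical.)
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FallbackSink {
  public static void store(byte[] record, Configuration conf) throws Exception {
    Path dest = new Path("incoming/events.log");
    try {
      write(FileSystem.get(new URI("s3a://primary-bucket/"), conf), dest, record);
    } catch (IOException e) {
      // S3 is unwell: buffer to a local HDFS cluster instead
      write(FileSystem.get(new URI("hdfs://backup-cluster/"), conf), dest, record);
    }
  }

  private static void write(FileSystem fs, Path p, byte[] data) throws IOException {
    try (FSDataOutputStream out = fs.create(p, true)) {
      out.write(data);
    }
  }
}
</pre>
<br />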
Anyway, we can now celebrate the fact that the entire internet now runs in four places: AWS, Google, Facebook and Azure. And we know what happens when one of them goes offline.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-29933348210593898842017-02-21T21:55:00.002+00:002017-02-21T22:08:17.014+00:00Why HTTPS is so essential, and auto-updating apps so dangerousI'm building up two laptops right now. One, a work one to replace the four-year-old laptop which died. The other, a mid-2009 macbook pro which I've refurbed with an SSD and built up clean.<br />
<br />
As I do this, I'm going through every single thing I'm installing to make sure I do somewhat trust it. That's me ignoring homebrew and where it pulls stuff from when I type something like "brew install calc". What I am doing is checking the provenance of everything else I pull down: validating any SHA-256 hashes they declare; making sure they come off HTTPS URLs, etc. The foundational stuff.<br />
<br />
We have to recognise that serving software up over HTTP is something to be phasing out and, if it is done, for the SHA-256 checksum to be published over HTTPS, or, even better, for the checksum to be signed by a GPG key, after which it can be served anywhere. While OSX <a href="https://developer.apple.com/library/content/technotes/tn2206/_index.html#//apple_ref/doc/uid/DTS40007919-CH1-TNTAG18" target="_blank">has supported signed DMG files since El Capitan</a>, unless you expect the disk image to be signed, you aren't going to notice when you pick up an unsigned malware variant.<br />
<br />
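For anyone who hasn't done it, the verification itself is trivial; here's a minimal sketch in Java of the check I mean. The file name is made up -the point is that the digest you compute locally must match the one published on an HTTPS page.<br />
<pre>// Compute the SHA-256 of a downloaded artifact, then compare the printed
// digest with the checksum published over HTTPS. (File name is invented.)
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;

public class Sha256Check {
  public static void main(String[] args) throws Exception {
    // fine for a sketch; stream the file in chunks for multi-GB images
    byte[] data = Files.readAllBytes(Paths.get("Example-1.0.dmg"));
    byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));
    }
    System.out.println(hex); // must equal the published SHA-256 value
  }
}
</pre>
<br />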
It's too easy for an open wifi station to redirect HTTP connections to somewhere malicious, and we all roam far too much. I realised while I was travelling that all it would take to get lots of ASF developers on your malicious base station is simply to bring it up in the hotel foyer or in a quiet part of the conference area, giving it the name of the hotel or conference respectively. We conference-goers don't have a way to authenticate these wifi networks.<br />
<br />
Anyway, most binaries I am downloading and installing are coming off HTTPS, which is reassuring.<br />
<br />
One that doesn't is virtualbox: Oracle are still serving these up over HTTP. They do at least <a href="https://www.virtualbox.org/download/hashes/5.1.14/SHA256SUMS" target="_blank">serve up the checksums over HTTPS</a>, but they don't do much to highlight how much checking matters. There's no "to ensure that these binaries haven't been replaced by malicious ones anywhere between your laptop and us, you MUST verify the checksums". No, it's just a mild hint: <i>"You might want to compare the SHA256 checksums or the MD5 checksums to verify the integrity of downloaded packages"</i>.<br />
<br />
Not HTTPS then, but at least artifacts whose checksums I can validate over HTTPS. These are on the dev box, happily.<br />
<br />
But here's something that I've just installed on the older, household laptop, "dogbert": <a href="http://software.garmin.com/en-GB/express-download.html#mac" target="_blank">Garmin Express.</a> This is a little app which looks at the data in a USB-mounted Garmin bike computer, grabs the latest activities and uploads them to Garmin's cloud infrastructure, where they make their way to Strava, somehow. Oh, and it pushes firmware updates in the other direction.<br />
<br />
The Garmin Express application is downloaded over HTTP, no MD5, SHA1 or anything else. And while the app itself is signed, OSX can and will run unsigned apps if the permissions are set. I have to make sure that the "allow from anywhere" option is not set in the security panel before running any installer.<br />
<br />
Here's the best bit though: that application does auto updates, any time, anywhere.<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/32883803322/in/dateposted/" title="Garmin Express D/Ls from HTTP; autoupdate by default"><img alt="Garmin Express D/Ls from HTTP; autoupdate by default" height="308" src="https://c1.staticflickr.com/3/2677/32883803322_a61370988f.jpg" width="500" /></a><br />
Which means that little app, set to automatically run on boot, is out there checking for notifications of an updated application, then downloading it. It doesn't install it, but it will say "here's an update" and launch the installer.<br />
<br />
Could I use this to get something malicious onto a machine? Maybe. I'd have to see if the probes for updates were on HTTP vs HTTPS, and if HTTP, what the payload was. If it was HTTPS, well, you are owned by whoever h<a href="https://support.apple.com/en-us/HT202858" target="_blank">as their CAs installed on your system</a>. That's way out of scope. But if HTTP is used, then getting the Garmin app to install an unsigned artifact looks straightforward. In fact, even if the update protocol is over HTTPS, given the artifact names of the updates can be determined, you could just serve up malicious copies all the time and hope that someone picks one up. That's less aggressive though, and harder to guarantee any success from subverted base stations at a conference.<br />
<br />
Rather than go to the effort of wireshark, we can play with lsof to see what network connections are set up on process launch:<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"># lsof -i -n -P | grep -i garmin<br />Garmin 9966 12u 0x5ccb80e39679382b 192.168.1.18:55235->40.114.241.141:443<br />Garmin 9966 16u 0x5ccb80e39679382b 192.168.1.18:55235->40.114.241.141:443<br />Garmin 9967 10u 0x5ccb80e396b4a82b 192.168.1.18:55233->2.17.221.5:443<br />Garmin 9967 13u 0x5ccb80e39687182b 192.168.1.18:55234->2.17.221.5:443<br />Garmin 9967 15u 0x5ccb80e3910b7a1b 192.168.1.18:55236->2.17.221.5:443<br />Garmin 9967 16u 0x5ccb80e39669e63b 192.168.1.18:55237->2.17.221.5:443<br />Garmin 9967 17u 0x5ccb80e396b4a82b 192.168.1.18:55233->2.17.221.5:443<br />Garmin 9967 18u 0x5ccb80e39687182b 192.168.1.18:55234->2.17.221.5:443<br />Garmin 9967 19u 0x5ccb80e3910b7a1b 192.168.1.18:55236->2.17.221.5:443<br />Garmin 9967 20u 0x5ccb80e3960c782b 192.168.1.18:55238->2.17.221.5:443<br />Garmin 9967 21u 0x5ccb80e39669e63b 192.168.1.18:55237->2.17.221.5:443<br />Garmin 9967 22u 0x5ccb80e3979fa63b 192.168.1.18:55239->2.17.221.5:443<br />Garmin 9967 23u 0x5ccb80e3910b4d43 192.168.1.18:55240->2.17.221.5:443<br />Garmin 9967 24u 0x5ccb80e3910b4d43 192.168.1.18:55240->2.17.221.5:443<br />Garmin 9967 25u 0x5ccb80e3979fa63b 192.168.1.18:55239->2.17.221.5:443<br />Garmin 9967 26u 0x5ccb80e3960c782b 192.168.1.18:55238->2.17.221.5:443</span></span><br />
<br />
2.17.221.5 turns out to be https://garmin.com/, so it is at least checking in over HTTPS there. What about the 40.114.241.141 address? Interesting indeed. Tap that into firefox as https://40.114.241.141, go through the advanced bit of the warning, and you can see that the certificate served up is valid for a set of hosts:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">dc.services.visualstudio.com, eus-breeziest-in.cloudapp.net, eus2-breeziest-in.cloudapp.net, cus-breeziest-in.cloudapp.net, wus-breeziest-in.cloudapp.net, ncus-breeziest-in.cloudapp.net, scus-breeziest-in.cloudapp.net, sea-breeziest-in.cloudapp.net, neu-breeziest-in.cloudapp.net, weu-breeziest-in.cloudapp.net, eustst-breeziest-in.cloudapp.net, gate.hockeyapp.net, dc.applicationinsights.microsoft.com </span></span><br />
<br />
That's interesting, because it means it's something in Azure space. In particular, rummaging around brings up hockeyapp.net as a key possible URL, given that HockeyApp is a <a href="https://hockeyapp.net/#s" target="_blank">monitoring service for instrumented applications</a>. I distinctly recall selecting "no" when asked if I wanted to participate in the "help us improve our product" feature, but clearly something is being communicated. All these requests seem to go away once app launch is complete, but it may be on a schedule. At least now I can be somewhat confident that the checks for new versions are being done over HTTPS; I just don't trust the downloads that come after.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-7814043810728702372017-02-15T21:12:00.003+00:002017-02-15T21:19:54.545+00:00Towards a doctrine of the Zero DayThe <a href="http://www.nytimes.com/2012/06/01/world/middleeast/obama-ordered-wave-of-cyberattacks-against-iran.html?_r=0" target="_blank">Stuxnet/Olympic Games</a> malware is awesome and the engineering teams deserve respect. There, I said it. The first in-the-field sighting of a mil-spec virus puts the mass market toys to shame. It is the difference between the first amateur rockets and the <a href="http://www.bbc.co.uk/history/ww2peopleswar/stories/78/a1302878.shtml">V1 cruise</a> and V2 ballistic missiles launched against the UK in WWII. It also represents that same change in warfare. <br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/32524923145/in/dateposted/" title="V1 Cruise missle and V2 rocket"><img alt="V1 Cruise missle and V2 rocket" height="500" src="https://c1.staticflickr.com/1/710/32524923145_e48e31e439.jpg" width="379" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
I say this having watched the documentary Zero Days, about nation-state hacking. One thing I like about it is its underdramatization of the coders. Gone are the clichéd angled shots of the hooded faceless hacker coding in darkness to a bleeping text prompt on a screen that looks like something from the matrix. Instead: offices with fluorescent lights compensating for the fact that the only people allocated windows are managers. What matrix-esque screen shots there were contained x86 assembly code in the font of IDA, showing asm code snippets accurate enough to give me flashbacks of when I wrote Win32/C++ code. Add some music and coffee mugs and it'd start to look like the real world.<br />
<br />
The one thing they missed out on is the actual engineering; the issue tracker, with OLYMPIC-342, "doesn't work with Farsi version of Word", being the topic of the standup; the monthly regression test panic when windows or flash updates shipped and everyone feared the upgrade had fixed the exploits. Classic engineering, hampered by the fact that the end users would never send stack traces. Even determining if your code worked in production would depend on intermittent status reports from the UN or order numbers for new parts from down the centrifuge supply chain. Let's face it: even getting the test hardware must have been an epic achievement of its own.<br />
<br />
Because Olympic Games was not just a piece of malware using multiple zero days and stolen driver certificates to gain admin access on gateway systems before jumping the airgap over USB keys and then slowly sabotaging the Iranian centrifuges. It was evidence that the government(s) behind it had decided that cyber-warfare (a term I really hate) had moved from a theoretical "look, this uranium stuff has energy" to the strategic "let's call this the manhattan project".<br />
<br />
And it showed that they were prepared to apply their work against a strategic asset of another country, during peacetime. And that they had a larger program, Nitro Zeus, intended to be the opening move of a war with Iran. <br />
<br />
As with those missiles and their payloads, the nature of war has been redefined.<br />
<br />
In Churchill's epic five-volume history of WWII, he talks about the D-day landings, and how he wanted to watch them from a destroyer, but was blocked by King George: "you are too valuable". Churchill wrote that everyone on those beaches felt that they were too valuable to be there too -and that the people making the decisions should be there to see the consequences of them. He shortly thereafter goes on to discuss the first V1 attacks on London, discussing their morality. He felt that the "war-head" (a new word) was too indiscriminate. He was right -but given this was 14 months ahead of August 1945, his morality didn't run that deep. Or the V1 and V2 bombings had convinced him that it was the future. (Caveat: I've ignored <a href="http://www.spiegel.de/international/europe/controversial-memorial-to-british-wwii-bombers-to-open-a-840858.html" target="_blank">RAF Bomber Command</a> as it would only complicate this essay). <br />
<br />
Eric Schlosser's book, <i>Command and Control</i>, discussed the post-war evolution of defence strategy in a nuclear age, and how nuclear weapons scared the military. Before: 1000 bombers to destroy a city like Hamburg or Coventry. Now: only one plane had to get through the air defences, and the country had lost. Which changed the economics and logistics of destroying nearby countries. The barrier to entry had just been reduced. <br />
<br />
The whole strategy of Mutually Assured Destruction evolved there, which, luckily for us, managed to scrape us through to the twenty-first century: to now. But that doctrine wasn't immediate, and even there, the whole notion of tactical vs. strategic armaments skirted around the fact that once the first weapons went off over Germany or Korea, things were going to escalate.<br />
<br />
Looking back though, you can see those step changes in technology, and how the leading edge technologies of each war enabled the doctrine of the next. The US civil war: rifles, machine guns, ironclad naval vessels, the first wire obstacles on the battlefield. WWI: the trenches with their barbed wire and machine guns; planes and tanks the new tech, radio the emergent communications alongside those telegraphs issuing orders to "go over the top!". WWII and Blitzkrieg was built around planes and tanks, radio critical to choreograph it; the Spanish civil war used to hone the concept and to inure Europe to the acceptance of bombing cities. <br />
<br />
And in the Cold War, as discussed, missiles, computers and nuclear weapons were the tools of choice. <br />
<br />
What now? Nuclear missiles are still the game-over weapons for humanity, but the non-nuclear weapons have changed, and so the tactics of war have changed too. And just as the Manhattan Project showed how easy it was to flatten a city, Olympic Games has shown how much damage you can do with laptops and a dedicated engineering team. <br />
<br />
One of the screenshots in the documentary was of <a href="https://www.hackread.com/how-bad-is-the-north-korean-cyber-threat/">the North Korean dev team</a>. They don't look like a dev team I'd recognise. It looks like the place where "breaking the build" carries severe punishment, rather than having to keep the "I broke the build!" poster(*) up in your cubicle until a successor inherits it. But it was an engineering team, and a lot less expensive than their same government's missile program. And it's something which can be used today, rather than held as a threat you dare not use. <br />
<br />
What now? We have the weapons; perhaps a doctrine will emerge. What's likely is that you'll see multiple levels of attack.<br />
<br />
The 2016 election; the Sony hack: passive attacks, data exfiltration and anonymous &amp; selective release. We may as well assume such attacks are common; it's only in special cases that we get to directly see the outcome so tangibly. <br />
<br />
Olympic Games and <a href="https://ics.sans.org/media/Media-report-of-the-BTC-pipeline-Cyber-Attack.pdf">the rumoured BTC pipeline attack</a>: destruction of targets -in peacetime, with deniability. These are deliberate attacks on the infrastructures of nations, executed without public announcement. <br />
<br />
<a href="https://www.nytimes.com/2016/02/17/world/middleeast/us-had-cyberattack-planned-if-iran-nuclear-negotiations-failed.html" target="_blank">Nitro Zeus (undeployed) </a>: this is the one we all have to fear in scale, but do we have to fear its use? As the opening move to an invasion, it's the kind of thing that could be deployed against Estonia or other countries previously forced into the CCCP against their will. Kill all communications, shut down the cities, and within 24h Russian troops could be in there "to protect Russian speakers from the chaos". China as a precursor to a forced reunification with Taiwan. Then there's North Korea. It's hard to see what a country that irrational would do -especially if they thought they could get away with it. <br />
<br />
Us in the west?<br />
<br />
Excluding Iraq, the smaller countries that Trump doesn't like -Cuba, N. Korea- lack the infrastructure to destroy. The big target would be his new enemy, China -but hopefully the entirety of the new administration isn't that mad. So instead it becomes a deterrent against equivalent attacks from other nation states with suitable infrastructure. <br />
<br />
What we can't do though is use it as a deterrent for Stuxnet-class attacks, not just on account of the destruction it would cause, but because it's so hard to attribute blame. <br />
<br />
I suspect what is going to happen is something a bit like the evolution of the Drone Warfare doctrine under Obama: it'll become acceptable to deploy Stuxnet-class attacks against other countries, in peacetime. Trump would no doubt love the power, though his need to seek public adulation will hamper the execution. You can't deny your work when your president announces it on twitter. <br />
<br />
At the same time, I can imagine the lure of non-attributable damage to a competing nation state. Something that hurts and hinders them -but if they can't place the blame, what's to lose? That I could see the Trump regime going for -and if it does happen to, say, China, and they work it out -well, it's going to escalate. <br />
<br />
Because that has always been the problem with the whole tactical to strategic nuclear arsenal. Once you've made the leap from conventional to nuclear weapons, it was going to escalate all the way. <br />
<br />
Do we really think "cyber-weaponry" isn't going to go the same way? From deleting a few files, or shutting down a factory to disrupting transport, a power grid? <br />
<br />
(*) the poster was a photo of the George Bush "mission accomplished" carrier landing, as I recall.<br />
<br />Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-3912869276826546304.post-25329194802605975812017-01-28T15:01:00.000+00:002018-05-26T17:52:24.175+01:00TRIDENT-877 missile veered towards wrong continent; hemisphere<div dir="ltr" style="text-align: left;" trbidi="on">
Apparently a test of a submarine-launched Trident missile went wrong: it started to head in the wrong direction and chose to abort its flight. <a href="http://www.bbc.co.uk/news/world-latin-america-38772757">The payload ended up in the Bahamas</a>.<br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/32484173406/" title="Aeronautics Museum"><img alt="Aeronautics Museum" height="500" src="https://c1.staticflickr.com/1/655/32484173406_c16fe1ffc8.jpg" width="379" /></a><br />
<br />
The whole concept of software engineering came out of <a href="http://homepages.cs.ncl.ac.uk/brian.randell/NATO/nato1968.PDF">a NATO conference in 1968</a>. <br />
<br />
The military were the first to hit this, because they were building the most complex systems: airplanes, ships, submarines, continent-wide radar systems. And of course: missiles.<br />
<br />
Missiles whose aim in life is to travel from a potentially mobile launch location to a preplanned destination, via a suborbital ballistic trajectory. It's inevitably a really complex problem: you've got a multistage rocket designed to be moved around in a submarine for decades, then launched without much preparation at a target a few thousand miles away. Which must make the navigation a fun little problem. <br />
<br />
We can all use GPS to work out where we are, even spacecraft, which know to use the other solution to the GPS timing equation -the one which doesn't have a solution close to the geode, our model of the Earth's surface. Submarines can't use GPS while under water, and they, like their deliverables, can't rely on the GPS constellation existing at the time of use. Which leaves what? Gyroscopic compasses, and inertial navigation systems: mindnumbingly complex bits of sensor hardware trying to work out acceleration on different axes, then using that, time, and knowledge of the starting point to work out where the missile is (there's a toy sketch of that arithmetic below). Then there's a little computer nearby using that information to control the rocket engines. <br />
<br />
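As a toy illustration of that dead-reckoning arithmetic -one dimension, invented numbers, none of the rotating-frame complexity of a real inertial navigation system- here's the double integration at the core of it:<br />
<pre>// Toy 1-D dead reckoning: double-integrate accelerometer samples.
// All values are invented; a real INS works in 3-D with rotating frames.
public class DeadReckoning {
  public static void main(String[] args) {
    double[] samples = {0.0, 2.0, 2.0, 1.0, -1.0}; // m/s^2, one per tick
    double dt = 0.01;                              // seconds per sample
    double velocity = 0.0;                         // known at launch
    double position = 0.0;                         // known at launch
    for (double accel : samples) {
      velocity += accel * dt;    // sensor bias accumulates linearly here...
      position += velocity * dt; // ...and quadratically here
    }
    System.out.printf("position=%.6f m, velocity=%.6f m/s%n", position, velocity);
  }
}
</pre>
<br />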
Once above enough of the atmosphere to see stars in daylight, the missiles switch to astronomy. This turns out to be <a href="http://fer3.com/arc/imgx/kaplan2.ppt" target="_blank">an interesting area of ongoing work</a> -IR CCDs can position vehicles at sea level when it's not cloudy (tip: always choose your war zones in desert climates). While the Trident missiles are unlikely to have been updated, a full submarine refresh is bound to have installed the shiny new stuff. And in a qualification test of a real launch, that's something you'd want to try. Though of course you would compare any celestial position data with the GPS feed.<br />
<br />
Yet somehow it failed. Apparently this was a "telemetry problem"; the missile concluded that something had gone wrong and chose to crash into the sea instead. I'm really curious about the details now, though we'll never get the specifics at a level to be that informative. First point: telemetry from the submarine to the missile? That is, something tracking the launch and providing (authenticated?) data to the missile which it could compare with its own measures? Or was it the other way around: missile data to submarine? That would seem more likely -having the missile broadcast out an encrypted stream of all its engine data and sensor input would be exactly what you want to identify launch-time problems. Perhaps it was some new submarine software which got confused, or got fed bad data somehow. If that was the case, and you could replicate the failure by feeding in the same telemetry, then yes, you could fix it and be confident that the specific failure was found and addressed. Except: you can't be confident that there weren't more problems from that telemetry, or other things to go wrong -problems which didn't show up because the flight was aborted.<br />
Or it was in-missile: sensor data on the rockets misleading the navigation system. In which case: why use the term "telemetry"?<br />
<br />
We aren't ever going to know the details, which is a pity as it would be interesting to know. It's going to be kept a secret though, not just for the sake of whoever we consider our enemies to be —but because it would scare us all. <br />
<br />
I don't see that you can say the system is production ready if there was any software problem. One with wiring up, maybe, or some other hardware problem where a replacement board -a well qualified board- could be swapped in. Maybe even an operations issue which can be addressed with changes in the runbook. But software? No.<br />
<br />
How do you show it works then? Well, testing is the obvious tactic, except, clearly, we can't afford to. Which is a good argument in favour of cruise missiles over ICBMs: they cost less to test. <br />
<br />
<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/32484174396/" title="Tomahawk Cruise missile"><img alt="Tomahawk Cruise missile" height="500" src="https://c1.staticflickr.com/1/366/32484174396_31857b762c.jpg" width="379" /></a><br />
<br />
Governments just don't take the software engineering and implementation details of modern systems into account; missiles are a special case, but things like the F-35 Joint Strike Fighter are another. Some of the software for that comes from BAe Systems a few miles away, and from what I gather, it's a tough project. The usual: over-ambitious goals and deadlines, conflicting customers, integration problems, suppliers blaming each other, etc, etc. Which is why <a href="http://www.dote.osd.mil/pub/reports/FY2016/pdf/dod/2016f35jsf.pdf" target="_blank">the delivery and quality of the software is called out as a key source of delays</a>, this in what is self-admittedly <a href="http://www.baesystems.com/en/product/f-35-lightning-ii">the world's largest defence programme.</a><br />
<br />
It's not that the teams aren't competent —it's that the systems we are trying to build are beyond what we can currently do, despite those ~50+ years of Software Engineering.<br />
<br />
<b>Update 2018-05-26</b>: when searching for this page with google, it turns out that "Trident 877" turns up parliamentary e<a href="https://www.parliament.uk/edm/2016-17/877">arly day motion 877</a> on this very topic. Coincidence!</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3912869276826546304.post-55520488111545047502016-12-01T18:26:00.001+00:002016-12-01T18:32:04.421+00:00How long does FileSystem.exists() take against S3?<a data-flickr-embed="true" href="https://www.flickr.com/photos/steve_l/31321873606/in/dateposted/" title="Ice on the downs"><img alt="Ice on the downs" height="375" src="https://c7.staticflickr.com/6/5324/31321873606_ced7944350.jpg" width="500" /></a><script async="" charset="utf-8" src="//embedr.flickr.com/assets/client-code.js"></script><br />
<br />
One thing I've been working on with my colleagues is improving performance of Hadoop, Hive and Spark against S3, one <tt>exists()</tt> or <tt>getFileStatus()</tt> call at a time.<br />
<br />
Why? This is a log of a test run showing how long it takes to query S3 over a long haul link. This is midway through the test, so the HTTPS connection pool is up, DNS has already resolved the hostnames. So these should be warm links to S3 US-east. Yet it takes over a second just for one probe.<br />
<pre>2016-12-01 15:47:10,359 - op_exists += 1 -> 6
2016-12-01 15:47:10,360 - op_get_file_status += 1 -> 20
2016-12-01 15:47:10,360 (S3AFileSystem.java:getFileStatus) -
Getting path status for s3a://hwdev-stevel/numbers_rdd_tests
2016-12-01 15:47:10,360 - object_metadata_requests += 1 -> 39
2016-12-01 15:47:11,068 - object_metadata_requests += 1 -> 40
2016-12-01 15:47:11,241 - object_list_requests += 1 -> 21
2016-12-01 15:47:11,513 (S3AFileSystem.java:getFileStatus) -
Found path as directory (with /)
</pre>
The way we check for a path p in Hadoop's S3 Client(s) is<br />
<pre>HEAD p
HEAD p/
LIST prefix=p, suffix=/, count=1
</pre>
A simple file: one HEAD. A directory marker: two. A path with no marker but one or more children: three. In this run, it's an empty directory, so two of the probes are executed:<br />
<pre>HEAD p => 708ms
HEAD p/ => 445ms
LIST prefix=p, suffix=/, count=1 => skipped
</pre>
That's 1153ms from invocation of the <tt>exists()</tt> call to it returning true —long enough for you to see the log pause during the test run. Think about that: determining which operations to speed up not through some fancy profiler, but by watching when the log stutters. That's how dramatic the long-haul costs of object store operations are. It's also why a core piece of the S3Guard work is to <a href="https://issues.apache.org/jira/browse/HADOOP-13449">offload that metadata storage to DynamoDB</a>. I'm not doing that code, but I am doing the <a href="https://issues.apache.org/jira/browse/HADOOP-13786">committer to go with it</a>. To be ruthless, I'm not sure we can reliably do that O(1) rename, massively parallel rename being the only way to move blobs around, and the committer API as it stands precluding me from implementing a single-file-direct-commit committer. We can do the locking/leasing in dynamo though, along with the speedup.<br />
<br />
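If you want to see this for yourself, here's a minimal sketch of timing a single probe through the Hadoop FileSystem API. The bucket name is invented, and you'd want many samples over warm connections before drawing conclusions:<br />
<pre>// Time one FileSystem.exists() probe against an object store.
// (Bucket and path are hypothetical; needs hadoop-aws on the classpath.)
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExistsTimer {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), new Configuration());
    Path p = new Path("/numbers_rdd_tests");
    fs.exists(p); // warm up the HTTPS connection pool and DNS first
    long start = System.nanoTime();
    boolean found = fs.exists(p);
    long millis = (System.nanoTime() - start) / 1_000_000;
    System.out.println("exists(" + p + ") = " + found + " in " + millis + " ms");
  }
}
</pre>
<br />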
What it should really highlight is that an assumption in a lot of code "getFileStatus() is too quick to measure" doesn't hold once you move into object stores, especially remote ones, and that any form of recursive treewalk is potentially pathologically bad. <br />
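One mitigation you can apply in your own client code: when you do need every file under a path, ask for a recursive listing in a single call rather than walking the tree with <tt>listStatus()</tt> -against an object store the flat listing can be served as paged LIST calls on the key prefix. A sketch:<br />
<pre>// Prefer one recursive listing over a listStatus() treewalk; object
// stores can answer it with a flat LIST of the key prefix.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class FlatList {
  public static long totalBytes(FileSystem fs, Path dir) throws IOException {
    long bytes = 0;
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(dir, true); // recursive
    while (it.hasNext()) {
      bytes += it.next().getLen();
    }
    return bytes;
  }
}
</pre>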
Remember that next time you edit your code.Unknownnoreply@blogger.com1