Twitter celebrated its 10th birthday this week, and those who have been on that social network long enough know that at least once a week there’s a massive outrage about something that, in the end, usually does not seem so bad. This week’s topic: someone
broke the Internet!
Wait, break the Internet? Well, sort of. In short, a package named
“left-pad” was removed from the official NPM repository. The action in itself sucks, then again the owner of the package
and build-time issues because of that broken dependency. It hit
the press, and some
bloggers gave their opinion on the issue. And here’s my opinion…
First of all, I think it’s insane to take a dependency on a package that pads a string with zeroes and contains 11 lines of useful source code. These utility functions typically go in your own codebase, but I agree this is debatable. But to me, taking a
dependency for something as trivial as that is a bit crazy – it’s like hiring an assistant to tie your shoe laces.
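To give a sense of scale, here is a rough Python sketch of what such a helper boils down to. This is not the left-pad package's actual source; the function name, the argument order and the default fill character are my own choices for illustration.

```python
def left_pad(value, length, fill=" "):
    """Pad `value` on the left with `fill` until it is `length` characters long."""
    text = str(value)
    if len(fill) != 1:
        raise ValueError("fill must be exactly one character")
    missing = length - len(text)
    # Nothing to do when the text is already long enough.
    if missing <= 0:
        return text
    return fill * missing + text


print(left_pad(42, 5, "0"))  # 00042
print(left_pad("abc", 2))    # abc
```

That is about all there is to it, which is why I would keep something like this in my own codebase.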
Second, while this all happened in NPM land, this could also happen in NuGet, Maven, Composer, PyPI, RubyGems and other package managers. Writing code in 2016? Then let me rephrase that: this could happen to you! Someone else can break your build! Imagine what
would happen if all of a sudden Newtonsoft.Json was removed from NuGet.org…
In my opinion, public repositories should never, ever allow package deletes. NuGet.org doesn’t allow this (except when there’s legal/copyright stuff involved, which happened
once in its 6-year lifetime). And I think other package managers should have the same policy. No deletes. Period.
Of course, there are edge cases like accidental publishes – it should be possible to remove those. But if a package has been downloaded more than, say, 10 times, it should stay. No exceptions.
Flashback to 2014. NuGet started to take off with early adopters and smart people all around. The package manager introduced package restore – a way to not have your dependencies in your source control system. Some people were
wary, others responded in
full sarcasm mode (damn I’m a sarcastic bastard sometimes). From a blog post I wrote in 2014:
Just like with source control, issue trackers and other things (like package restore) in your build process, you should read up on them, play with them and know the risks. Do we know that our Internet connection can break during solar storms? Well yes. It’s
a minor risk but if it’s important to your shop do mitigate that risk. Do laptops break? Yes. If it’s important that you can keep working even if a laptop crashes, buy some more and keep them up-to-date with your main development machine. If you rely on GitHub
and want to get work done if they have issues, make sure you have an up to date fork somewhere on a file share. Make that two file shares!
And if you rely on NuGet package restore… you get the point, right? For NuGet, there are private repositories available that can host your in-house packages
and the ones you are using from upstream sources like NuGet.org. Use them, if they matter for your development process. Know about NuGet 2.8’s automatic fallback to the local cache you have on disk and if something goes wrong, use that cache
until the package source is back up.
The development process and the tools are part of your system. Know your tools. Even if it requires you to read crazy books like how to work with git. Or
Pro NuGet 2.
See that bold highlight? That’s basically the exact same thing I want to point out in this blog post. If you depend on a package that is critical to you, then mirror it. There are various in-house and hosted package repositories available,
for example in the NuGet space (MyGet has been around
since 2011 for exactly this reason).
If it is life threatening, mirror your dependencies. If you’re okay with hanging out in a bar for an afternoon if an upstream repository is down for a bit, or re-writing left-padding code because a package has been removed, then don’t mirror. Know your risks,
think about how much of a threat they present to you, and act accordingly. (keyword here is: think)
But I need it so bad!
For those of you who did depend on left-pad and did not want to take action: NPM (and NuGet, and…) typically store a huge number of packages on every developer and CI machine’s disk. I just checked my machine and have 3 GB of NPMs on there, and 6 GB of NuGets.
Talk to a colleague: who knows, you may be able to find left-pad again, upload it to your private repository and be done with it.
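If you want to go hunting on disk, something like the Python sketch below could help. It assumes an installed copy lives in a directory named after the package with a package.json inside, the way it does under a node_modules folder. The function name is made up for this post, and the directories to scan are whatever you pass on the command line.

```python
import os
import sys


def find_cached_package(name, roots):
    """Return directories called `name` that contain a package.json file."""
    matches = []
    for root in roots:
        # os.walk silently skips roots that do not exist.
        for current, dirs, files in os.walk(root):
            if os.path.basename(current) == name and "package.json" in files:
                matches.append(current)
                # Found a copy; do not descend into the package's own files.
                dirs[:] = []
    return sorted(matches)


if __name__ == "__main__":
    # Usage: python find_package.py <dir> [<dir> ...]
    for path in find_cached_package("left-pad", sys.argv[1:]):
        print(path)
```

Point it at your project folders or a build agent's working directory, and check the version in the package.json you find before uploading anything to your private repository.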
In one of our production systems, we’re using Azure Websites to host a back-end web API. It runs on several machines and benefits from the automatic load balancing we get on Azure Websites. When going through request logs, however, we discovered that of
these several machines a few were getting a lot of traffic, some got less and one even only got hit by our monitoring system and no other traffic. That sucks!
In our back-end web API we’re not using any session state or other techniques where we’d expect the same client to always end up on the same server. Ideally, we want round-robin load balancing, distributing traffic across machines as much as possible. How
to do this with Azure Websites?
How load balancing in Azure Websites works
Flashback to 2013. Calvin Keaton did a TechEd session titled “Windows Azure Web Sites: An Architecture and Technical Deep Dive”. In this session (around
51:18), he explains what Azure Websites architecture looks like. The interesting part is the load balancing: it seems there’s a boatload of reverse proxies that handle load balancing at the HTTP(S) level, using
IIS Application Request Routing (ARR, like a pirate).
In short: when a request comes in, ARR forwards the request to the actual web server. Right before sending a response to the client, ARR slaps a “session affinity cookie” on the response, which it uses on subsequent requests to direct that specific user’s requests
back to the same server. You may have seen this cookie in action when using Fiddler on an Azure Website – look for
ARRAffinity in cookies.
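If you would rather check this from a script than from Fiddler, a small Python sketch like the one below can pick the cookie out of the Set-Cookie header values of a response. The helper name and the sample header values are made up for illustration; only the cookie name ARRAffinity comes from what we saw in practice.

```python
def affinity_cookie(set_cookie_headers, name="ARRAffinity"):
    """Return the value of the affinity cookie, or None when it is not set."""
    for header in set_cookie_headers:
        # The cookie's name=value pair always comes before any attributes.
        first_pair = header.split(";", 1)[0]
        key, _, value = first_pair.partition("=")
        if key.strip() == name:
            return value.strip()
    return None


# Hypothetical Set-Cookie values, shaped like what a response might carry.
headers = [
    "theme=dark; Path=/",
    "ARRAffinity=0123abcd; Path=/; HttpOnly; Domain=example.azurewebsites.net",
]
print(affinity_cookie(headers))                 # 0123abcd
print(affinity_cookie(["theme=dark; Path=/"]))  # None
```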
Disabling Application Request Routing session affinity via a header
By default, it seems ARR does try to map a specific client to a specific server. That’s good for some web apps, but in our back-end web API we’d rather not have this feature enabled. Turns out this is possible: when Application Request Routing 3.0 was released,
a magic header was added to achieve this. From the release blog post:
The special response header is Arr-Disable-Session-Affinity and the application would set the value of the header to be either
True or False. If the value of the header is true, ARR would not set the affinity cookie when responding to the client request. In such a situation, subsequent requests from the client would not have the affinity cookie in
them, and so ARR would route that request to the backend servers based on the load balance algorithm.
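To make that behavior a bit more tangible, here is a toy model in Python. It is a sketch of the idea only and not how ARR is implemented: the ToyProxy class, the server names and the plain round-robin choice are assumptions of mine, and a real affinity cookie holds an opaque identifier rather than a server name.

```python
import itertools
from collections import Counter

AFFINITY_COOKIE = "ARRAffinity"
DISABLE_HEADER = "Arr-Disable-Session-Affinity"


class ToyProxy:
    """A toy stand-in for a reverse proxy; not how ARR is implemented."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = itertools.cycle(self.servers)

    def handle(self, request_cookies, backend_headers):
        # Honor an existing affinity cookie, else pick the next server in turn.
        server = request_cookies.get(AFFINITY_COOKIE)
        if server not in self.servers:
            server = next(self._next)
        response_cookies = {}
        # The backend opts out of affinity by sending the header as "true".
        if backend_headers.get(DISABLE_HEADER, "").lower() != "true":
            response_cookies[AFFINITY_COOKIE] = server
        return server, response_cookies


def simulate(backend_headers, requests=9):
    """Send `requests` calls from one client and count hits per server."""
    proxy = ToyProxy(["web1", "web2", "web3"])
    cookies, hits = {}, Counter()
    for _ in range(requests):
        server, new_cookies = proxy.handle(cookies, backend_headers)
        cookies.update(new_cookies)  # the client keeps whatever cookie it got
        hits[server] += 1
    return dict(hits)


print(simulate({}))                        # {'web1': 9}
print(simulate({DISABLE_HEADER: "true"}))  # {'web1': 3, 'web2': 3, 'web3': 3}
```

With the cookie in play, a single client sticks to the first server it lands on; with the header set, the same client gets spread across all three.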
Aha! And indeed: after adding the following to our Web.config, load balancing seems better for our scenario:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer><httpProtocol><customHeaders>
    <add name="Arr-Disable-Session-Affinity" value="true" />
  </customHeaders></httpProtocol></system.webServer>
</configuration>
Disclaimer: I’m an Azure geezer and may have misnamed Azure App Service Web Apps as “Azure Websites” throughout this blog post.