c. scott andreas

Distributed systems, core data infrastructure, and analytics in San Francisco and Cupertino, CA.


Around the web –

Personal blog: blog.paradoxica.net
GitHub: github.com/cscotta
LinkedIn: linkedin.com/in/cscotta
Twitter: twitter.com/cscotta

Selected Articles

Here are links to a few articles I've written.


Articles

Some Thoughts on "No-Ops"

Feb 2, 2012
After the neologism “NoOps” made its way around the web, AppFog took a stab at reclaiming the term following its rejection by developers and operations teams alike. I originally posted this as a comment on a link aggregator, but the original post has been deleted, so I’ve moved it here to give it a home.

Ordasity: Building Stateful Clustered Services on the JVM

October 20, 2011
Ordasity is a new library from Boundary designed to make building and deploying reliable clustered services on the JVM as straightforward as possible. It’s written in Scala and uses ZooKeeper for coordination. This article describes its use and implementation.

A Day in the Life of a Mobile Device: IP Connectivity

July 7, 2011
An investigation into the quality of mobile data networks via log analysis of the connection activity of Android devices to a messaging system.

C550k in Action at Urban Airship

August 24, 2010
A description of the challenges involved in, and approaches to, designing a system capable of serving half a million concurrently connected clients per server.

Conferences and Talks

Here are links to a few talks I've offered.


Talks

Searching for Truth in Distributed Applications: A Look at the Network

Upcoming at OSCON 2012 in Portland, OR
This talk offers a deep dive into how application-level problems manifest at the network level. The cases range from basic network partitions and node outages to sophisticated application-level changes: garbage collection-induced pauses; classes of bugs that evade conventional monitoring but constitute partial failures; changes in network activity based on database partitioning, load balancing, and sharding; and other warning signs that crop up at layer three long before they wreak havoc at layer seven as customer-visible failures. Combining application-level metrics with network analytics is a powerful cocktail for identifying hot spots quickly, and connecting the dots out to the client closes the whole loop.

Designing Stateful Distributed Applications

Proposed - TBA
Faced with unprecedented growth and equally demanding calls for reliability and predictability, we as engineers find ourselves called to develop stable distributed applications with solid scalability characteristics and seamless failure modes – and to get them into production by yesterday. While some applications can be designed as stateless, shared-nothing systems, others (such as databases, caches, stream processing engines, and other stateful systems) require predictable computation and a more complex distribution story. This talk provides an overview of popular distributed application design strategies (Dynamo, master / slave, and centrally-coordinated but self-organizing systems), load balancing techniques, warm handoff and rebalancing, and clean handling of failures.

Garbage, Garbage Everywhere

November 17, 2011
This talk from a Boundary Pizza, Beer, & Tech Talks meetup includes a walkthrough of Boundary's stream processing infrastructure and garbage collection strategies for pushing the bounds of JVM throughput.

Designing + Implementing Asynchronous Distributed Systems: Challenges, Strategies, and a Million Things That Go Wrong

OSCON 2011
A language-agnostic discussion focusing on concepts and strategies applicable to many programming languages, with specific examples in static languages like Java/Scala, conventional dynamic languages such as Python/Ruby, and emerging platforms such as Node.js.

Open Source Projects

Here are links to a few open source projects I actively maintain or have worked on:

Ordasity

Author

Ordasity is a library designed to make building and deploying reliable clustered services on the JVM as straightforward as possible. It's written in Scala and uses ZooKeeper for coordination. Ordasity's simplicity and flexibility allow us to quickly write, deploy, and (most importantly) operate distributed systems on the JVM without duplicating distributed "glue" code or revisiting complex reasoning about distribution strategies.

Octobot

Author

Octobot is a task queue worker designed for reliability, ease of use, and throughput on the JVM. Octobot can listen on any number of queues, with any number of workers processing messages from each.

Scalang

Committer

Scalang is a message-passing and actor library that allows Scala and Erlang applications to easily communicate. Scalang includes a full implementation of the Erlang distributed node protocol. It provides an actor-oriented API that can be used to interact with Erlang nodes in an idiomatic, OTP-compliant way. Scalang is built on Netty for its networking layer and Jetlang for its actor implementation.

Overlock

Committer

Overlock is a concurrency utility library for Scala with clean wrappers atop the non-blocking atomic data structures in Cliff Click's high-scale-lib.

Miscellaneous Projects

Contributor

I've also contributed code and changes to projects such as Apache Kafka, Apache Cassandra, and Twitter's "Commons" library for the JVM.

Get in Touch

Email: Shoot me an e-mail at scott [at] paradoxica.net.

IRC: Sometimes I hang out on freenode as "cscotta."

Phone: Please ask. :-)