Go up and never stop

Amazon’s innovation culture

At re:Invent, one can only admire the sheer number of innovations AWS announces. There were 339 announcements before the event even started. I had the chance to attend several sessions at this year's re:Invent to get a better understanding of Amazon's innovation culture.

It all starts with the customer and ends with the customer. At Amazon, everything stems from the company's mission: "To be Earth's most customer-centric company". Start by identifying what the customer needs and innovate solutions from there. Jeff Bezos has stated in a letter to shareholders that "customers are always beautifully, wonderfully dissatisfied". It echoes what Henry Ford said many years ago: customers do not know what they want. Amazon is not trying to build a faster horse, but something that will delight customers and meet their needs. Amazon is after the minimum lovable product (MLP).

Innovation needs structure, and at Amazon it is organised around four interconnected concepts: culture, mechanism, architecture and organisation. Culture is built around the 14 leadership principles, and these are the core of innovation. The principles are guidelines that help people bring out the best in themselves; they challenge everyone to be curious, take ownership, challenge the status quo, move fast and think long term. Amazon uses data and takes calculated risks, but speed matters, so there are guidelines to help with this. Some decisions are one-way doors, others two-way doors. A decision is a two-way door if it can be reversed; the risk is then lower, and one should move with speed. It is also interesting to hear that Amazon pushes leaders to make decisions when they have only around 70% of the data they would like to have.

Amazon does 198 million deployments a year

Amazon's architecture allows rapid development. Amazon went from one monolith to microservices, and this allowed them to move from a quarterly release cycle to 198 million deployments a year. The transformation from monolith to microservices was guided by Conway's law, which states that "organisations which design systems … are constrained to produce designs which are copies of the communication structures of these organisations." Amazon mapped the communication structures of its organisation and came up with two-pizza teams (American pizzas) of 4-8 people. These decentralised teams have freedom and responsibility and are able to fail fast. I had the chance to talk to many AWS employees, and to my surprise they all had the same feeling about working for Amazon: to them it felt like working for a startup. This is somewhat amazing when we are talking about a company with 750,000 employees.

The mechanism for innovation is the working backwards process. Working backwards is a deliberately cumbersome process that starts with identifying the customer and the customer need. To understand the customer, there are five questions: Who is the customer? What is the problem or opportunity? What is the benefit? How do you quantify it? What does the experience look like? The customer is always a person. Once the customer and the need have been clearly understood and validated with data, one can start brainstorming boundless, big and bold ideas. After this it is time to draft a fictional press release, internal and external FAQs, and finally a visual of the customer experience. The press release is a way to democratise the idea, so that everyone has a similar opportunity to share theirs. Every meeting around the idea starts with all members reading the paper and then iterating over it. All of this is done by the team created around the idea, before a single line of code is written. It is a heavy process, but its purpose is to make sure that what you build answers a customer need.

It was said that AWS itself came out of a press release drafted by Andy Jassy and iterated 45 times. Amazon Prime was also built from a press release. There have of course been mistakes, such as the Amazon Fire Phone, but the lessons learned from it laid the foundation for Alexa. There is no magic in this, just hard work. To build something for the customer, one has to start with the customer.

AWS Redshift breaks the bond between compute and storage

AWS Redshift took a huge leap forward with the new releases. Decoupling storage and compute is the first step towards a modern cloud data warehouse.

AWS Redshift is the world's most popular data warehouse, but it has faced tough competition in the market. Redshift has traditionally coupled compute and storage, meaning that with a given instance type you get a set amount of storage, which can sometimes be limiting. In his keynote, Andy Jassy announced a new managed storage model for Redshift that lets you scale compute independently of storage.

The storage model uses SSDs and S3 behind the scenes and takes advantage of architectural improvements in the infrastructure. This allows users to keep hot data on SSD while seamlessly querying historical data stored in S3 from Redshift. On top of this, you only pay for the SSD storage you use. The model also comes with new Nitro-based compute instances. In Ireland an RA3 node is priced at $15.578 per node/hour, but you get 48 vCPUs, 384 GB of memory and up to 64 TB of storage, and you can scale a cluster up to 128 nodes. AWS promises 3x the performance of any other cloud data warehouse service, and Redshift Dense Storage (DS2) users are promised twice the performance and twice the storage at the same cost. RA3 is available now in Europe in EU (Frankfurt), EU (Ireland), and EU (London).
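As a concrete illustration, here is a minimal boto3 sketch of provisioning an RA3 cluster in EU (Ireland). The cluster identifier, credentials, database name and node count are hypothetical placeholders, not values from the announcement.

```python
import boto3

# Redshift client in EU (Ireland), one of the regions where RA3 is available.
redshift = boto3.client("redshift", region_name="eu-west-1")

# Provision a two-node RA3 cluster; compute scales independently of the
# managed storage, which tiers between local SSD and S3 behind the scenes.
response = redshift.create_cluster(
    ClusterIdentifier="analytics-ra3",    # hypothetical cluster name
    NodeType="ra3.16xlarge",              # 48 vCPUs and 384 GB of memory per node
    NumberOfNodes=2,                      # clusters can scale up to 128 nodes
    MasterUsername="awsuser",             # placeholder credentials; in practice,
    MasterUserPassword="ChangeMe123!",    # keep real secrets in Secrets Manager
    DBName="dev",
)
print(response["Cluster"]["ClusterStatus"])  # "creating" while it provisions
```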

Related to the decoupling of compute and storage, AWS announced AQUA (Advanced Query Accelerator), which promises 10 times better query performance. AQUA sits on top of S3 and is Redshift-compatible. We will have to wait until mid-2020 to get our hands on this feature.

While AWS Redshift is the world's most popular data warehouse, it is not practical to load every kind of data into it. Sometimes a data lake is a more suitable place, especially for unstructured data, and Amazon S3 is the most popular choice for cloud data lakes. The new Redshift features help tie structured and unstructured data together to enable even better and more comprehensive insight.

With the Federated Query feature (in preview), it is now possible to query data in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL from a Redshift cluster. The queried data can then be combined with data in the Redshift cluster and in Amazon S3. Federated queries allow data ingestion into Redshift without any separate ETL tool, by extracting data from the above-mentioned sources, transforming it on the fly, and loading it into Redshift. Data can also be exported from Redshift to S3 in Apache Parquet format using the Data Lake Export feature. With these features you can build some nice lifecycle management into your design, as the sketch below illustrates.
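Below is a hedged sketch of both features driven from Python with psycopg2 (Redshift speaks the PostgreSQL wire protocol). The endpoint, schema and table names, IAM role ARNs, secret ARN and S3 paths are all illustrative assumptions, not values from AWS documentation.

```python
import psycopg2

# Connect to the Redshift cluster; the endpoint and credentials are placeholders.
conn = psycopg2.connect(
    host="analytics-ra3.xxxxxxxx.eu-west-1.redshift.amazonaws.com",
    port=5439, dbname="dev", user="awsuser", password="ChangeMe123!",
)
conn.autocommit = True

with conn.cursor() as cur:
    # Federated Query: expose an Aurora PostgreSQL database as an external schema.
    cur.execute("""
        CREATE EXTERNAL SCHEMA IF NOT EXISTS apg
        FROM POSTGRES
        DATABASE 'orders' SCHEMA 'public'
        URI 'aurora-cluster.cluster-xxxxxxxx.eu-west-1.rds.amazonaws.com' PORT 5432
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftFederatedRole'
        SECRET_ARN 'arn:aws:secretsmanager:eu-west-1:123456789012:secret:apg-creds'
    """)

    # Join live operational data with a warehouse table in a single query.
    cur.execute("""
        SELECT o.order_id, o.total, d.customer_segment
        FROM apg.orders o
        JOIN dim_customer d ON d.customer_id = o.customer_id
        WHERE o.created_at > current_date - 7
    """)

    # Data Lake Export: unload cold rows to S3 as Apache Parquet, e.g. as the
    # archiving step of a lifecycle design (hot data in Redshift, history in S3).
    cur.execute("""
        UNLOAD ('SELECT * FROM sales WHERE sale_date < date ''2018-01-01''')
        TO 's3://my-data-lake/sales/archive/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftUnloadRole'
        FORMAT AS PARQUET
    """)
```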

"One should use the best tool for the job", Andy Jassy reminded us in his keynote. With the long-awaited decoupling of storage and compute and big improvements to the core, Redshift took a huge leap forward. It will be extremely interesting to start designing new solutions with these features.