More than ten years ago, Marc Andreessen published his famous essay “Why Software Is Eating the World” in The Wall Street Journal. In it, he explained from an investor’s perspective why software companies were taking over whole industries.
As the founder of a company that enables GraphQL at the edge, I want to share my perspective on why I believe the edge is actually eating the world. Based on observation and first-principles reasoning, we’ll take a quick look back at the past, examine the present, and dare to peer into the future.
Let’s start.
Web applications have used the client-server model for over forty years. A client sends a request to a server that runs a web server program, which returns the contents of the web application. Both client and server are just computers connected to the internet.
In 1998, five MIT students observed this and had a simple idea: let’s distribute the files across many data centers around the planet, working with telecom providers to use their networks. The idea of the so-called content delivery network (CDN) was born.
CDNs started storing not only images but also video files and really any data you can imagine. These points of presence (PoPs), by the way, are the edge: servers distributed across the globe, sometimes hundreds or thousands of them, whose entire purpose is to store copies of frequently accessed data.
While the initial focus was on providing the right infrastructure and simply “getting it to work”, these CDNs were hard to use for many years. A developer experience (DX) revolution in CDNs started in 2014. Instead of manually uploading your website’s files and then connecting them with a CDN, the two parts got packaged together. Services such as Surge.sh, Netlify, and Vercel (fka Now) appeared.
Distributing static website assets via CDN is now an absolute industry standard.
Okay, so we’ve now moved static assets to the edge. But what about compute? And what about dynamic data stored in a database? Can we also reduce latency there by moving it closer to the user? If so, how?
Welcome to the edge
Let’s look at two sides of the edge:
1. Compute
2. Data
In both areas, we are seeing incredible innovation that will revolutionize the way applications work in the future.
Compute
What if incoming HTTP requests didn’t have to go all the way to a data center that lives far, far away? What if they could be served directly next to the user? Welcome to edge computing.
The further we move from one centralized data center to multiple decentralized data centers, the more we need to deal with a new set of tradeoffs.
At the edge, you can’t just scale up one powerful machine with hundreds of gigabytes of RAM for your application; you don’t have that luxury. Imagine you want your application to run in 500 edge locations, all close to your users. Buying a powerful machine 500 times over simply isn’t economical; it’s far too expensive. The option is to go for a smaller, more minimal setup.
Serverless is an architectural pattern that fits these constraints well. You don’t host the machine yourself; you just write a function, which is then executed by an intelligent system whenever it’s needed. You no longer need to worry about the abstraction of an individual server: you just write functions that run and basically scale infinitely.
As you can imagine, these functions should be small and fast. How can we do this? What is a good runtime for those fast and small functions?
Currently, there are two popular answers in the industry: using JavaScript V8 isolates or using WebAssembly (WASM).
JavaScript V8 isolates, popularized by Cloudflare Workers, allow you to run a full JavaScript engine at the edge. When Cloudflare introduced Workers in 2017, they were the first to offer this new, simplified computing model for the edge.
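To make this concrete, here is a minimal sketch of what such an edge function can look like, using the Workers-style module syntax. The route and the origin URL are hypothetical; a real function would contain your application logic:

```ts
// A minimal edge function sketch in the Workers-style module syntax.
// It runs in every PoP and answers directly next to the user,
// without ever touching a central origin server for cached paths.
export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);

    // Hypothetical route: answer a health check right at the edge.
    if (url.pathname === "/ping") {
      return new Response("pong", { status: 200 });
    }

    // Everything else still goes to the (hypothetical) origin.
    return fetch("https://origin.example.com" + url.pathname);
  },
};
```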
Since then, various providers, including Stackpath, Fastly, and our good ol’ Akamai, have released their own edge computing platforms. A new revolution has begun.
WebAssembly, which first appeared in 2017, is an alternative computing model to the V8 JavaScript engine that is ideal for edge computing. It is a fast-growing technology that big companies like Mozilla, Amazon, Arm, Google, Microsoft, and Intel are investing in heavily. It allows you to write code in any language and compile it into a portable binary that can run anywhere, whether in a browser or in various server environments.
WebAssembly is without a doubt one of the most important developments on the web in the past 20 years. It already powers chess engines and design tools in the browser, runs on the blockchain, and will likely replace Docker.
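As a small illustration of what “a portable binary that can run anywhere” means in practice, here is a sketch that instantiates a tiny hand-assembled WASM module exporting a single add function. The same bytes run unchanged in a browser, in Node.js, or in a WASM-capable edge runtime:

```ts
// Minimal hand-assembled WASM module exporting add(a: i32, b: i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local.get 0, local.get 1, i32.add
]);

async function run() {
  // Instantiate the module and call the exported function.
  const { instance } = await WebAssembly.instantiate(wasmBytes);
  const add = instance.exports.add as (a: number, b: number) => number;
  console.log(add(2, 3)); // 5
}

run();
```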
Data
While we already have some edge compute offerings, the biggest obstacle to the success of the edge revolution is bringing data to the edge. If your data still sits in a far-away data center, you gain nothing by moving your computation next to the user; your data is still the bottleneck. To fulfill the main promise of the edge and speed things up for users, there is no way around finding solutions to distribute the data as well.
You might be wondering, “Can’t we just replicate the data all around the world into our 500 data centers and make sure it’s up to date?”
While there are novel approaches to replicating data around the world, like Litestream, which recently joined fly.io, unfortunately it’s not that easy. Imagine you have 100TB of data that needs to run in a sharded cluster of multiple machines. It’s simply not economical to replicate that data 500 times.
Some way is needed to still be able to store large amounts of data while bringing it to the edge.
In other words, with constrained resources, how can we distribute our data in a smart, efficient way so that we can still serve it quickly at the edge?
In such a resource-constrained situation, there are two approaches the industry is already using (and has been using for decades): sharding and caching.
To shard or not to shard
With sharding, you split your data into multiple datasets according to certain criteria. For example, you could use a user’s country as a way to split the data, so that you can store that data in different geographic locations.
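As a rough sketch of the idea (the regions and the routing table below are made up for illustration), sharding by country can be as simple as a lookup from a shard key to the data center that owns that shard:

```ts
// A toy sketch of country-based sharding: each user's country decides
// which regional database shard owns their data. Regions and the
// mapping below are purely illustrative.
type Region = "eu-west" | "us-east" | "ap-south";

const shardByCountry: Record<string, Region> = {
  DE: "eu-west",
  FR: "eu-west",
  US: "us-east",
  IN: "ap-south",
};

function resolveShard(countryCode: string): Region {
  // Fall back to a default shard for countries we haven't mapped yet.
  return shardByCountry[countryCode] ?? "us-east";
}

// A request from a German user would be routed to the "eu-west" shard.
console.log(resolveShard("DE")); // "eu-west"
```

The hard part, of course, is everything this sketch hides: rebalancing shards, handling users who move, and queries that span multiple shards.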
Implementing a general sharding framework that works for all applications is very challenging. A lot of research has happened in this area over the past few years. Facebook, for example, came up with a sharding framework called Shard Manager, but even that only works under certain conditions and needs many researchers to keep it running. We’ll still see a lot of innovation in this space, but it won’t be the only solution to bring data to the edge.
Cache is King
The other approach is caching. Instead of storing all 100TB of my database at the edge, I can set a limit of, say, 1GB and only store the data that is accessed most frequently. Keeping only the most popular data is a well-understood problem in computer science, and the LRU (least recently used) algorithm is one of the best-known solutions here.
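For illustration, here is a minimal LRU cache sketch that exploits the insertion order of a JavaScript Map. A production edge cache would add TTLs, size-based limits, and persistence, but the eviction idea is the same:

```ts
// A minimal LRU cache: the least recently used entry is evicted first.
// A JavaScript Map keeps insertion order, so re-inserting an entry on
// every read moves it to the "most recently used" end.
class LRUCache<K, V> {
  private entries = new Map<K, V>();

  constructor(private readonly capacity: number) {}

  get(key: K): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Refresh recency by re-inserting the entry at the end.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    // Evict the least recently used entry once we exceed capacity.
    if (this.entries.size > this.capacity) {
      const oldestKey = this.entries.keys().next().value as K;
      this.entries.delete(oldestKey);
    }
  }
}

// Usage: a cache with room for two entries.
const cache = new LRUCache<string, string>(2);
cache.set("a", "1");
cache.set("b", "2");
cache.get("a");              // "a" is now the most recently used entry
cache.set("c", "3");         // evicts "b", the least recently used
console.log(cache.get("b")); // undefined
```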
You might be asking, “Why don’t we all use LRU caching for data at the edge and call it quits?”
Well, not so fast. We want the data to be correct and fresh: ultimately, we want the data to be consistent. But wait! When it comes to data consistency, you have a whole range of options: from the weakest level, “eventual consistency”, all the way up to “strong consistency”, with many levels in between, such as “read your own writes” consistency.
The edge is a distributed system, and when processing data in a distributed system, the laws of the CAP theorem apply. The idea is that you will need to make tradeoffs if you want your data to be strongly consistent, that is, if once new data is written, you never want to see older data again.
Such strong consistency in a global setting is only possible if the different parts of the distributed system agree at least once on what just happened. That means that if you have a globally distributed database, it still needs to send at least one message to every other data center around the world, which introduces unavoidable latency. Even FaunaDB, a brilliant new database, can’t get around this fact. Honestly, there’s no free lunch: if you want strong consistency, you need to accept that it comes with a certain latency overhead.
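A rough back-of-the-envelope sketch (the numbers are approximations, not measurements) shows why that overhead is physically unavoidable:

```ts
// Light in optical fiber travels at roughly 200,000 km/s, and the far
// side of the globe is roughly 20,000 km away. Even a single agreement
// round trip therefore costs on the order of hundreds of milliseconds.
const speedInFiberKmPerSec = 200_000;  // approximation
const halfwayAroundTheGlobeKm = 20_000; // approximation

const oneWayMs = (halfwayAroundTheGlobeKm / speedInFiberKmPerSec) * 1000;
const roundTripMs = oneWayMs * 2;

console.log(oneWayMs);    // ~100 ms one way
console.log(roundTripMs); // ~200 ms round trip, before any processing
```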
Now you might ask, “But do we always need strong consistency?” The answer is: it depends. There are many applications that do not require strong consistency to run. One of them is, for example, this tiny online store you’ve probably heard of: Amazon.
Amazon created a database called DynamoDB, which runs as a distributed system with extreme scale capabilities. However, it’s not always fully consistent. While they make it “as consistent as possible” with many smart tricks (as explained here), DynamoDB doesn’t guarantee strong consistency.
I believe that an entire generation of applications can run just fine with eventual consistency. In fact, you probably already have some use cases in mind: social media feeds are sometimes slightly outdated but usually fast and usable. Blogs and newspapers tolerate a few milliseconds or even seconds of delay for published articles. As you can see, eventual consistency is acceptable in many cases.
Assuming we’re fine with eventual consistency, what do we gain from it? It means we don’t need to wait until a change has been acknowledged everywhere. With that, we no longer have the latency overhead when distributing our data globally.
However, getting “good” eventual consistency is not an easy task either. You need to deal with this tiny little problem called “cache invalidation”: when the underlying data changes, the cache needs to update. Yes, you guessed it: this is an extremely difficult problem. So difficult that it has become a running joke in the computer science community.
Why is it so hard? You need to keep track of all the data you’ve cached, and you need to properly invalidate or update it as soon as the underlying data source changes. Sometimes you don’t even control that underlying data source. For example, imagine using an external API like the Stripe API. You need to build a custom solution to invalidate that data.
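As a rough sketch of one common pattern (the cache interface, keys, and event shape here are hypothetical and not Stellate’s actual implementation), you can listen for change events from the external source and purge the affected cache entries:

```ts
// A hypothetical invalidation sketch: when the external data source tells
// us something changed (e.g. via a webhook), we purge the related cache
// entries so the next read fetches fresh data.
interface ChangeEvent {
  type: string;     // e.g. "product.updated"
  objectId: string; // id of the object that changed
}

// Hypothetical edge cache interface; a real one might be the LRU cache
// above, a CDN purge API, or a key-value store at the edge.
interface EdgeCache {
  delete(key: string): Promise<void>;
}

async function handleChangeEvent(event: ChangeEvent, cache: EdgeCache): Promise<void> {
  // Map the event to the cache keys it affects. This mapping is the hard,
  // application-specific part of cache invalidation.
  const affectedKeys = [
    `product:${event.objectId}`,
    `productList:recent`, // lists containing the object go stale too
  ];

  await Promise.all(affectedKeys.map((key) => cache.delete(key)));
}
```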
In short, that’s why we built Stellate: to make this tough problem more bearable and even solvable by equipping developers with the right tooling. Let’s be frank: if GraphQL, a strongly typed API protocol and schema, didn’t exist, we wouldn’t have created this company. Only with strong constraints can you manage this problem.
I believe both approaches will become better suited to these new needs, but no single company can “solve data”. It will take an industry-wide effort to solve this problem.
There is still a lot to say on this topic, but for now, I think the future of this field is bright and I’m excited about what’s to come.
The future: It’s right here, right now
With all the technological advancements and constraints outlined above, let’s take a look at the future. It would be presumptuous to do so without mentioning Kevin Kelly.
At the same time, I admit that we cannot predict where our technological revolution is going, nor which specific products or companies will lead and win in this space 25 years from now. Entirely new companies may lead the way, some of which haven’t even been created yet.
However, there are some trends we can predict, because they are already happening right now. In his 2016 book The Inevitable, Kevin Kelly discusses 12 technological forces that are shaping our future. Much like the title of his book, here are eight of those forces:
Cognifying: the cognification of things, a.k.a. making things smarter. This will need more and more compute directly where it’s needed. Classifying roads for self-driving cars in the cloud, for example, is simply not feasible, right?
Flowing: We will have more and more streams of real-time information that people depend on. This can also be latency critical: imagine controlling a robot to complete a task. You don’t want to route the control signals halfway around the globe if it isn’t necessary. But a constant stream of information, a chat application, a real-time dashboard, or an online game can’t afford high latency either, and hence needs to leverage the edge.
Screening: More and more things in our lives will get screens, from smartwatches to refrigerators and even your digital scale. With that, these devices will be permanently connected to the internet, forming a new generation of the edge.
Sharing: The growth of collaboration on a massive scale is inevitable. Imagine you’re working on a document with a friend who sits in the same city. So why send all that data back to a data center on the other side of the globe? Why not store the document right next to the two of you?
Filtering: We will harness intense personalization in order to anticipate our desires. This might actually be one of the biggest drivers of edge compute. As personalization is about a person or a group, it’s a perfect use case for running edge compute next to them. It will speed things up, and milliseconds equal profit. We already see this used in social networks, but it is also gaining adoption in ecommerce.
Interacting: We will immerse ourselves more and more in our computers to maximize engagement. This immersion will inevitably be personalized and run directly on, or very close to, the user’s devices.
Tracking: Big Brother is here. We will be tracked more, and it’s unstoppable. More sensors will collect ever more data, and that data can’t always be transported to a central data center. Real-world applications therefore need to make fast, real-time decisions.
Beginning: Ironically, last but not least, is the factor of “beginning”. The last 25 years served as an important starting platform. However, let’s not rest on the trends we see. Let’s embrace them so we can create the greatest good, not just for us developers but for all of humanity. I predict that in the next 25 years, shit will get real. That’s why I say edge caching is eating the world.
As I mentioned before, the problems we programmers face will not be solved by one company alone but will require the help of our whole industry. Want to help us solve this problem? Or just want to say hi? Reach out anytime.
Tim Suchanek is Stellate’s CTO.