I recently returned from some off-the-grid hiking in the beautiful North Cascades National Park. It was a perfect time to practice some landscape photography and share these memories with friends and family. This blog post uses the example of sharing vacation photos to explore the concept of edge computing, explain why it is important, and help readers find ways to apply these concepts to modernize IT.
When hiking, it's important to hike in groups and bring only the essentials to minimize weight and extend distance. While my group has a strict "no laptops / no working" policy during vacation, we do take advantage of the modern convenience that is a smartphone. In this case, we treat our smartphones a bit like the "glass cockpit" in a modern airplane, using the map, data, and GPS to help us navigate efficiently. We keep some paper backups and use a lot of digital redundancy to ensure the failure of one device doesn't disrupt our journey. Travel logistics, backcountry food plans, trail maps, and everything else we could plan were first worked out on cloud services (Google Workspace) for convenience among the group. When the trip began, we took cached copies onto our smartphones so these details were accessible to us off-network. The smartphone, that ubiquitous and rugged-enough edge computing device, became our tether to all things data while hiking through the wilderness.
Before we get into why edge computing is important, this is a good time to define "edge computing," which simply refers to bringing computing resources as close to the source of data and to users as possible. This is in contrast to "cloud computing," where large centralized data centers provide depth of capabilities but aren't located near users.
Why is edge computing important even though an edge device on its own has limited utility? Because it is paired with a centralized backend that accommodates data sharing, provides long-term storage, and facilitates deeper analysis. In our case, without network connectivity in wilderness areas, our edge computing devices could only record data to local storage. We had limited ability to share or fully analyze the data and a fixed quantity of storage capacity. Critically, though, we had an equipment form factor that let us effectively use our maps and capture new data (photos, path tracking, etc.). Industrial and commercial edge computing devices may not be as small as a smartphone, but they are still engineered to offload some data processing needs to remote systems.
Recommendation 1: Tailor edge computing to sunny-day and rainy-day scenarios and expect network failure as inevitable. Ensure sufficient buffering of services and local capability within your operational disruption error budget.
When we consider network connectivity gaps and system failure as being inevitable, we can design the edge computing environment to accommodate them. This is why edge computing is important.
-For a photographer, this means "bring enough storage" in the form of additional memory cards. The design priority is continuing to record new data. A delay in sharing photos or in post-processing edits that use additional remote resources is tolerable until the network is reachable.
-For a factory process control computer, a network disruption means running factory machines in a fail-safe mode and recording activity data locally. The safety mechanisms are the priority and must function without the need to receive instructions from remote resources. However, a delay in processing output data or receiving order shipping information is tolerable. Sufficient safety and data capture are the priorities in this example.
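As a rough illustration of "bring enough storage," the sketch below sizes a local buffer for the longest outage you are willing to tolerate and refuses new records only when that budget is exhausted. The class name, function name, and the 1.5 safety factor are assumptions made for this example, not part of any particular product.

```python
import math
from collections import deque


class LocalBuffer:
    """Hypothetical fixed-capacity store for records captured while offline."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = deque()

    def record(self, item):
        """Store an item locally; return False when local storage is full."""
        if len(self.records) >= self.capacity:
            return False  # out of "memory cards": the caller decides what to do
        self.records.append(item)
        return True

    def free_slots(self):
        """Number of records that can still be stored before the buffer is full."""
        return self.capacity - len(self.records)


def required_capacity(records_per_hour, max_outage_hours, safety_factor=1.5):
    """Estimate how many records must fit locally to ride out an outage.

    The safety factor is an assumed margin, not a recommended value.
    """
    return math.ceil(records_per_hour * max_outage_hours * safety_factor)
```

For example, a device producing 100 records per hour that must survive an 8-hour outage would need room for 1,200 records under these assumptions.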
The power and elegance of combining edge and cloud computing to share photos became truly apparent at the end of the trip, when we were all awaiting departure at the airport. Once we reconnected to LTE and Wi-Fi, our photos and fitness data were automatically synchronized to cloud storage. Secure access was granted to the group members and extended as read-only to social networks. There is enough computing, storage, and networking access through the smartphone that I no longer have to wait until I get home (or to a photo lab) to process and upload photos to share. Millions of photos are shared securely this way every day, and the same can happen with any other digital information your business runs on.
Recommendation 2: Design edge computing so that the distributed system recovers automatically and efficiently once connectivity returns.
-For a photographer, this means creating offsite backup copies of photos automatically when network access is detected. This ensures the loss or failure of an individual device doesn't affect the rest of the data.
-For a factory producing hard goods, this means that output data is triggered to upload when network connectivity is restored. This eliminates the need for manual data entry during the period of downtime and allows checking for fresh data from central systems such as newly added orders and shipping information.
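The "upload when connectivity is restored" behavior in both examples can be sketched as a small store-and-forward routine. This is a minimal sketch under stated assumptions: `upload` and `is_online` are hypothetical callables supplied by the caller, and a failed upload is assumed to raise `OSError`.

```python
def sync_pending(pending, upload, is_online):
    """Upload locally buffered records in order and return what is still unsent.

    pending   -- list of locally stored records, oldest first
    upload    -- hypothetical callable that sends one record and raises
                 OSError if the network fails mid-transfer
    is_online -- hypothetical callable returning True when the network is reachable
    """
    remaining = list(pending)  # work on a copy so the caller's list is untouched
    while remaining and is_online():
        try:
            upload(remaining[0])
        except OSError:
            break  # network dropped again; keep the record for the next attempt
        remaining.pop(0)  # discard the local copy only after a confirmed upload
    return remaining
```

A record is removed from the local queue only after its upload succeeds, so an interruption partway through never loses data; the next call simply resumes with the oldest unsent record.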
A now famous example of why edge computing is important is the distributed edge computing model implemented in Tesla's connected cars and autopilot software. In the linked article, a network failure was noted as causing users to be unable to unlock their cars using their phones. In Tesla's design choice, some convenience features, such as using the phone as a key and receiving feature software updates, are unavailable if network connectivity is lost. However, the core functions of maintaining occupant safety, manual control of vehicle dynamics, and manual ingress/egress procedures are preserved. Tesla provides users with a regular key that serves the same function as a paper map on a hiking trail: it backs up the core function should the more convenient option cease to work.
Whether it's a critical financial analysis spreadsheet, a CAD drawing, or a media publication, we find users expect to access data from anywhere at any time to stay productive. Why does edge computing matter so much in the modern era? Because today we use cloud identities to authenticate and securely access data from anywhere, not just from a specific place in the network like a VPN or office computer. Today we can use the practically limitless capacity of cloud storage to publish data at whatever fidelity our users need, wherever they are. Today we can provide users with powerful edge computing devices with the capacity and capability to maintain productivity throughout system and network disruption. We can use these edge and cloud computing concepts to make sharing business data as easy as sharing on social media, with fail-safe operation and endpoint security to protect ourselves.
All of this is why edge computing is important in today’s world. What's stopping your business from using smart edge devices like smartphones to empower users and present your business data securely while on the go?
Lucid Point Sr. Cloud Architect
Why do you need a data warehouse? You've already got a database, and it has all of your business information inside it. You've been getting reports from this database for years, so why incur the additional expense of a data warehouse too? What is a data warehouse, and how does it differ from a database?
A database, to most, is a repository of information from an application, typically a single application but sometimes from many. This database is designed to make transactional systems run efficiently and is typically an OLTP (online transaction processing) database. It allows concurrent access to the data in real-time in a secure manner while maintaining integrity, reducing redundancy, and restricting access to prohibited data.
So, what is a data warehouse? It is another layer added to an existing database or databases, designed to perform analytical requests on the data effectively and efficiently. A data warehouse imports data from one or more source datasets into its own data structure. Regardless of the internal designs and processes underneath the data warehouse, it can then be used to run complex and multidimensional queries on the data without affecting the production data environment, and without being affected by it. These queries and reports can also run more quickly due to the way the data is structured, and because a data warehouse can accept data from multiple sources, analysis and reports can be run against a broad range of data, such as sales, customer support, and application data. A data warehouse is also used to store historical data for longer periods of time while keeping it available for comparative analysis. This is especially useful for trending data, where you don't want the historical information or metadata cluttering the transactional/primary database, but where it can still be used to monitor growth trends. A growing use of this kind of warehouse is applying AI/ML (Artificial Intelligence/Machine Learning) to analyze vast amounts of historical data to help make better-informed business decisions.
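To make the idea of importing from multiple sources concrete, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite stands in for both the transactional databases and the warehouse purely for illustration; the table names, column names, and sample rows are all invented for this example.

```python
import sqlite3

# Two hypothetical source systems, each with its own transactional database.
sales_db = sqlite3.connect(":memory:")
sales_db.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
sales_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 120.0), ("acme", 80.0), ("globex", 50.0)],
)

support_db = sqlite3.connect(":memory:")
support_db.execute("CREATE TABLE tickets (customer TEXT, severity TEXT)")
support_db.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [("acme", "high"), ("globex", "low"), ("globex", "high")],
)

# The warehouse is a separate database, so analysis never touches production.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_orders (customer TEXT, amount REAL)")
warehouse.execute("CREATE TABLE fact_tickets (customer TEXT, severity TEXT)")
warehouse.executemany(
    "INSERT INTO fact_orders VALUES (?, ?)",
    sales_db.execute("SELECT customer, amount FROM orders").fetchall(),
)
warehouse.executemany(
    "INSERT INTO fact_tickets VALUES (?, ?)",
    support_db.execute("SELECT customer, severity FROM tickets").fetchall(),
)

# A cross-source question: revenue and ticket count per customer.
report = warehouse.execute("""
    SELECT o.customer, o.revenue, t.ticket_count
    FROM (SELECT customer, SUM(amount) AS revenue
          FROM fact_orders GROUP BY customer) AS o
    JOIN (SELECT customer, COUNT(*) AS ticket_count
          FROM fact_tickets GROUP BY customer) AS t
      ON t.customer = o.customer
    ORDER BY o.customer
""").fetchall()
```

Neither source database could answer the combined question on its own, and the aggregate query runs entirely against the warehouse copy rather than the production systems.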
Alongside knowing what a data warehouse is, it is also important to note that many of the characteristics behind a transactional database do not work well with analytics. As the amount of data grows with the business, the reports used to make business decisions take longer and longer to run. Likewise, a data warehouse does not work well as a primary database, as its data is not easily amenable to rapid or atomic change. In fact, one of the more common ways to utilize a data warehouse is with transient data, where data sources are loaded periodically from snapshots, analyzed, and then purged for the next batch of data. How is a data warehouse structured? Its internal structure and organization can differ from those of a traditional database, up to and including columnar storage and parallel, sharded, and clustered processing, all of which allow for greater processing ability. The benefits of a columnar vs. row structure in a data warehouse will be covered in a future post.
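The transient load, analyze, and purge cycle mentioned above might look something like the following sketch. It again uses `sqlite3` as a stand-in for a real warehouse, and the `staging` table, its columns, and the sample snapshots are hypothetical.

```python
import sqlite3


def analyze_snapshot(warehouse, snapshot_rows):
    """Load one snapshot, analyze it, then purge it for the next batch."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS staging (region TEXT, units INTEGER)"
    )
    warehouse.executemany("INSERT INTO staging VALUES (?, ?)", snapshot_rows)
    totals = warehouse.execute(
        "SELECT region, SUM(units) FROM staging GROUP BY region ORDER BY region"
    ).fetchall()
    warehouse.execute("DELETE FROM staging")  # purge so the next snapshot starts clean
    warehouse.commit()
    return totals


warehouse = sqlite3.connect(":memory:")
monday = analyze_snapshot(warehouse, [("east", 5), ("west", 3), ("east", 2)])
tuesday = analyze_snapshot(warehouse, [("west", 4)])
```

Because each batch is purged after analysis, Tuesday's totals reflect only Tuesday's snapshot and the staging table never grows beyond a single batch.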
The key reasons to use a data warehouse over a collection of disparate data sources are that it allows complex queries to be executed outside of the production environment, it allows related information spread across multiple sources to be analyzed together more easily, and it speeds up the analytical and reporting processes, especially when there are massive amounts of data involved. The best reason to use a data warehouse I have seen is that it off-loads the processing from the primary or production database, freeing up those resources and allowing the analysis and reporting to be run at any time. A data warehouse set in the cloud can also be spun down when not needed, and additional resources can be allocated as needed, both of which can result in significant cost savings without impacting either the production database or the data warehouse.
LucidPoint Sr. Cloud Engineer