Wonder Plumbers, Why Data Engineers Are The Unsung Heroes

Be a coder! The drive to enthuse students (and anyone keen to learn) across the computer science industry is dominated by messaging that encourages people to gain certification and skills in software application development. Front-line and now finally chic-to-be-geek, developers are always in the spotlight (for good and bad reasons) and are generally regarded as the stars of the technology world in the neo-cloud era of AI, where the promise of quantum computing hangs close on the horizon.

But developers are only half, if that, of the equation.

We only need to think about the rise of DevOps, and the need to recognize operations teams as well as developers, to understand why the Ops crew is so fundamental. The operations function is an unsung gang of heroes made up of database administrators, system administrators (aka sysadmins), site reliability engineers and every other subgenre of operations, all the way through to penetration testers and other security professionals. It matters.

Today then, in the age of AI and the rise of the so-called data-driven enterprise, a new Ops hero rises: the data engineer.

Precision Plumbing

While data scientists and AI specialists receive the plaudits, job titles and remuneration of innovators, data engineers have often been relegated to the sidelines, seen as the “plumbers” of the data world, i.e. an essential utility function, but one ranked somewhere alongside sanitation engineers (trash collectors) and other cleaning and maintenance staff.

“Data engineers – and their counterparts in database administration – are the secret weapon of any data-driven enterprise, responsible for building the infrastructure and quality control that makes data-driven decision-making possible. Unlike plumbing, this identity crisis doesn’t hold water,” said Mark Molyneux, EMEA CTO, Cohesity. “As artificial intelligence and machine learning become increasingly adopted in enterprises, the distinction between data engineers and data scientists is blurring. In reality, data engineers and administrators have skills critical in automation: understanding structured and unstructured data formats, quality, compliance, security, classification and orchestration – the very foundations of successful automation.”

Molyneux tables the proposal to champion these unsung plumbing heroes in the context of the age-old “garbage-in, garbage-out” maxim that governs the core need for data quality and control in modern enterprise technology stacks. He says that the responsibility for clean, normalized, reliable data has always fallen to data engineers, who construct and ensure quality control.

“Their work is far from mundane and requires meticulous attention and deep knowledge of coding and data structures. It takes on challenges with structured data and the growing volumes of unstructured data to decide how it is collected, stored, accessed and referenced,” underlined Molyneux, speaking to press this month in London.

Guardian Gatekeepers

He expanded on this point by turning to security and governance: data engineers also play a pivotal role as gatekeepers, enforcing data security, data privacy, accurate access control and appropriate encryption. In this role, they ensure that their organisations can meet GDPR, the California Consumer Privacy Act and other regulatory requirements. “Many data engineers will be competent coders and write complex SQL queries, build data models and apply machine learning techniques in their pipelines.”

“Data engineers drive automation across the data lifecycle and their work in DevOps allows enterprises to scale their AI capabilities without bottlenecks. This efficiency enables businesses to deploy AI models faster and ensure continuous delivery of insights,” said Molyneux.

He further points to tasks including classification and analytics. Storing data according to clear rules and guidelines allows it to be classified, so that it sits on the right storage medium and is easy to reference. Because the data is identifiable, security and access control become far more accurate, and it is simple to report on data compliance for auditors or regulators. Data engineers’ work sets the stage for data analytics by creating a framework to extract, store and interrogate data. This leans heavily into trusted and responsible data governance, widely agreed to be a core tenet for future technologies such as AI to be successful.

“The original role of data engineers was to create a location for data to be held in a queryable format. The function has grown and evolved in parallel with expecting more from the data, but everything starts with needing to understand the structure of the data itself. This is where – excuse the pun – data administrators excel,” concluded Cohesity’s Molyneux.

Leaving us with some ideas to ponder, Molyneux suggests that as the volume of data grows and is increasingly automated and analyzed, the data engineer’s role will expand further. He absolutely insists (in a positive way) that the identity crisis of data engineers needs to end: they are not sluice-pipe plumbers or simple builders following plans.

We may just find that, just as developers eventually became fashionable, the core operations function and the data engineering discipline are ultimately viewed much the same way we regard nurses today, i.e. no less vital than doctors and often even more wonderful. When the inevitable levelling out comes, a new meritocracy will surface and we will start to realize that we really did love our data engineers after all. Until then, be nice to everyone, just in case.


Adnen Hamouda

Software and web developer, network engineer, and tech blogger passionate about exploring the latest technologies and sharing insights with the community.
