There is something to be learned in how children approach learning. They possess a deep curiosity and drive to classify and identify things. Much like our five senses help us to do that (it looks like this, but it feels like that and smells like…), there are questions that children are taught to ask in the investigative process. Maybe you remember them: who, what, where, when, why, and how.
There is a reason those basics have stuck around: they are the questions we need answered to get to the heart of an issue. And trust me, simple as they are, they work for data architecture and data management, too.
Visualize This: A Trade
I’m going to use a financial example, but keep in mind this is applicable across domains. So, let’s look at a single, ‘relatively’ simple trade: John Doe buys 100 shares of Widget Company from Jane Doe Brokerage House. It seems like we don’t have much to investigate, right?
Wrong. A simple trade, from a data maintenance and management standpoint, isn’t necessarily that simple. Look at the fields below, which could (read: mostly do) exist, and the varying, dizzying views of them:
Put into our “investigative table format”, it looks like this:
Now, that was just a tiny snapshot, but the reason I wanted to show it is that:
- Every data point, no matter how large or small, has a who, a what, a where, a when, a why, and a how (yes, I’ve just anthropomorphized data)
- Understanding those views is how we turn data into information
- A view represents a “truth” or an end-state to an investigative process
As you “shift” the view (from the Trade to the Salesperson to the Region), four things happen:
- New data points appear – they are specific to the view
- Some data points persist – they’re shared objects of the same definition and meaning
- Some data points are adjusted – they’re related objects of similar definition and meaning (e.g. counterparty to client)
- Some data points disappear or are discarded – they still “exist” in the data ether but are not important and/or relevant to the specific view
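To make the four outcomes concrete, here is a minimal sketch that compares two views of the same trade as sets of field names. Everything in it is an assumption for illustration: the field names, the `adjusted_map` lookup, and the `compare_views` helper are hypothetical and not taken from any real schema. Only the counterparty-to-client adjustment comes from the example above.

```python
# Hypothetical field sets for two views of the same trade.
# The field names are illustrative assumptions, not a real schema.
trade_view = {"trade_id", "security", "quantity", "counterparty", "settlement_date"}
salesperson_view = {"trade_id", "security", "quantity", "client", "commission"}

# Related fields whose definition shifts between views (old name -> new name).
adjusted_map = {"counterparty": "client"}


def compare_views(source, target, adjusted):
    """Classify fields by what happens to them when the view shifts."""
    # An adjustment only counts if the old name is in the source view
    # and the new name is in the target view.
    adjusted_from = {old for old, new in adjusted.items()
                     if old in source and new in target}
    adjusted_to = {adjusted[old] for old in adjusted_from}
    return {
        "new": target - source - adjusted_to,
        "persisted": source & target,
        "adjusted": {old: adjusted[old] for old in adjusted_from},
        "dropped": source - target - adjusted_from,
    }


shift = compare_views(trade_view, salesperson_view, adjusted_map)
for outcome, names in shift.items():
    print(outcome, names)
```

The set arithmetic covers the easy cases, but the adjustment map has to be maintained by someone who knows the business meaning of both fields. That knowledge cannot be derived from the field names alone.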
Data and Biological Morphology
Biological morphology is concerned with the classification of organisms at both the external level of coherence and consistency and the internal (genetic) level. Several patterns come out of this area when it comes to the species that populate our planet:
- Externally similar and/or identical; genetically distinct and isolated
- Externally different; genetically closely related
- Varying levels of overlap
- Evolutionary convergence / divergence
- A solid “coherence” externally and internally
More often than not, our data states (my term for the “active” data architecture in an organization) look just like the list above. Some of our data is fully “coherent”. Some is a mish-mash. Some converges and/or diverges “organically”, driven by the steady increase in data volumes and types and the general “expansion” of chaos. And then we have data that looks like a duck, walks like a duck, talks like a duck, yet somehow is an elephant.
Data acts like an organism over time and we need to treat it as such. While we have a measure of ability to predict the direction some of our structures and forms will grow in, there are no guarantees, and “great leaps forward” that are unprecedented and unpredictable can and will occur.
The only way to adjust and expand our overall data universe to accept and process those transformations, great and small, is to go back to the basics:
- Who is this data for? Who created it, owns it, sees it, cares about it?
- What is this data?
- Where, either systematically or physically, does this data live?
- When: what is the time component of this data?
- Why is it here – the purpose of this data is…?
- How do we create it, maintain it, use it, dispose of it, do anything with it? What are the ways and means and matters of this data?
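One way to put the six questions to work is to record the answers next to the data itself and flag the ones nobody has answered yet. The sketch below is a minimal illustration under stated assumptions: the `DataProfile` class and its `unanswered` method are hypothetical names, and the only values filled in are the ones the trade example actually gives us.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataProfile:
    """One answer slot per investigative question; None means unanswered."""
    who: Optional[str] = None
    what: Optional[str] = None
    where: Optional[str] = None
    when: Optional[str] = None
    why: Optional[str] = None
    how: Optional[str] = None

    def unanswered(self) -> List[str]:
        """Return the questions that still have no answer, in declaration order."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Only the who and the what are known from the trade example;
# the remaining questions are deliberately left open.
trade = DataProfile(
    who="John Doe (buyer), Jane Doe Brokerage House (counterparty)",
    what="Purchase of 100 shares of Widget Company",
)

print(trade.unanswered())  # prints ['where', 'when', 'why', 'how']
```

Even for our “simple” trade, four of the six questions are still open, which is a fair picture of how much investigation a single data point can need.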