The Promises of Enterprise Data Warehouses Fulfilled with Big Data

Remember back in the 1990s/2000s when Data Warehouses were all the rage? The idea was to take data from all the transactional databases behind the multiple e-Commerce, CRM, financials, lead generation, and ERP systems deployed in a company and merge them into one data platform. It was the dream: CIOs were ponying up big dollars because they thought it would solve finance, sales, and marketing’s most significant problems. It was even termed the Enterprise Data Warehouse, or EDW. The new EDW would take 18 months to deploy, as ETLs had to be written from the various systems and the data had to be normalized to work within the EDW. In some cases, the team made bad decisions about how to normalize the data, causing all sorts of issues down the road. When the project finished, there would be this beautiful new data warehouse, and no one would be using it.

The EDW needed a report writer to build fancy reports in a specialized tool like Cognos, Crystal Reports, Hyperion, or SAS. A meeting would be called to discuss the data, and all 12 attendees would show up with different reports and different numbers, depending on the formulas behind each report. Eventually, someone from Finance in the analysis, budgeting, and forecasting group would learn the tool, become the go-to person, and work with the technology team assigned to create reports.

Then Big Data came along. Big Data even sounds better than Enterprise Data Warehouse, and frankly, given the issues back in the 1990s/2000s, the Big Data branding doesn’t carry the same negative connotations.

Big Data isn’t a silver bullet, but it does a lot of things right. First and foremost, the data doesn’t require normalization; in fact, normalization is discouraged. Big Data absorbs transactional database data, social feeds, eCommerce analytics, IoT sensor data, and a whole host of other data and puts it all in one repository. The person from Finance has been replaced by a team of highly trained data scientists who build analysis models and extract insights with statistical tools (such as the R programming language) and Natural Language Processing (NLP). The data scientists spend days poring over the data, extracting information, building models, rebuilding models, and looking for patterns. The data could be text, voice, video, images, social feeds, or transaction data, and the data scientist is looking for something interesting.
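
As a rough sketch of that “no up-front normalization” idea, the snippet below lands raw extracts as-is in a single repository and leaves structure to be imposed at analysis time. The file names and columns are hypothetical, and it assumes pandas is available.

```python
# Minimal "schema-on-read" sketch: land raw data as-is, no up-front normalization.
# File names and columns are hypothetical; assumes pandas is installed.
from pathlib import Path
import pandas as pd

# A transactional extract and a social feed, loaded with whatever schema they arrive in.
orders = pd.read_csv("exports/orders_2018.csv")                      # e.g. order_id, sku, amount, ts
mentions = pd.read_json("exports/brand_mentions.jsonl", lines=True)  # e.g. user, text, created_at

# Tag each record with its source and land it untouched in the repository;
# structure is only imposed later, when an analysis actually needs it.
Path("datalake/raw").mkdir(parents=True, exist_ok=True)
for name, frame in [("orders", orders), ("social", mentions)]:
    frame.assign(source=name).to_json(
        f"datalake/raw/{name}.jsonl", orient="records", lines=True)
```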

Big Data’s impacts are huge and its benefits immense, but my favorite is predictive analytics. Predictive analytics forecasts behavior from historical and current data; in effect, it predicts the future. Predictive analytics is all over retail: you see it on sites as “Other Customers Bought” or in purchase recommendations based on your history. Airlines use it to predict component failures on planes, investors use it to predict stock movements, and the list of industries using it goes on and on.
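
To make the “Other Customers Bought” idea concrete, here is a minimal sketch of one common approach: recommend items that most often co-occur with a customer’s purchases. The baskets are made up, and this is an illustration, not any particular retailer’s algorithm.

```python
# Toy item co-occurrence recommender over made-up purchase baskets.
from collections import Counter
from itertools import combinations

baskets = [
    {"camera", "sd_card", "tripod"},
    {"camera", "sd_card"},
    {"camera", "camera_bag"},
    {"tripod", "camera_bag"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1

def also_bought(item, top_n=3):
    """Items most frequently purchased alongside `item`."""
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if item == a:
            scores[b] += n
        elif item == b:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]

print(also_bought("camera"))  # ['sd_card', 'tripod', 'camera_bag']
```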

The cloud is a huge player in the Big Data space: Amazon, Google, and Azure all offer Hadoop and Spark as services. The best thing about the cloud is that as data is absorbed by the gigabyte or terabyte, the cloud provides the storage for all of it. Lastly, because it’s in the cloud, it’s relatively easy to deploy a Big Data cluster, and hopefully AI in the cloud will soon replace the data scientists as well.
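
As a rough illustration, a managed Spark service (EMR, Dataproc, HDInsight) runs jobs along the lines of the PySpark sketch below. The bucket paths and column names are hypothetical, and it assumes PySpark is available on the cluster.

```python
# Hypothetical PySpark job: aggregate raw orders sitting in object storage.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-revenue").getOrCreate()

# Read the raw order data straight out of the bucket -- no normalization step.
orders = spark.read.json("s3://example-datalake/raw/orders/")

# Total revenue per day, however many gigabytes or terabytes have landed.
daily = (orders
         .groupBy(F.to_date("order_ts").alias("day"))
         .agg(F.sum("amount").alias("revenue")))

daily.write.mode("overwrite").parquet("s3://example-datalake/curated/daily_revenue/")
spark.stop()
```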

BGP Route Reflectors

Studying for the CCNP ROUTE 300-101 exam, I noticed there is no discussion of Border Gateway Protocol (BGP) Route Reflectors. They don’t even make the exam blueprint. BGP Route Reflectors are one of the most important elements of multi-homed, multi-location BGP. This blog post is not going to be a lesson in BGP, as there are plenty of resources that do a great job explaining the topic. Within an Autonomous System (AS), if there are multiple BGP routers, an iBGP full mesh is required; that’s a fancy way of saying all the BGP routers within an AS need to peer with each other. Let’s take the example of a large company with Internet peering in New York, Atlanta, and San Francisco. If the company runs a single AS, that means at least 3 BGP routers, and for business reasons the routers are deployed in redundant, dual-homed pairs. That makes 6 BGP routers. Remember, the formula for a full mesh is N(N-1)/2, so 6 routers would require 15 iBGP peering sessions. iBGP makes a logical connection over TCP, but that is still 15 configurations. This is a small example, and it doesn’t scale: increase to 10 routers and you need 45 iBGP sessions and configurations.
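
A quick back-of-the-envelope check of that formula:

```python
# Full-mesh iBGP session count: N(N-1)/2
def ibgp_full_mesh_sessions(routers: int) -> int:
    return routers * (routers - 1) // 2

for n in (3, 6, 10):
    print(f"{n} routers -> {ibgp_full_mesh_sessions(n)} iBGP sessions")
# 3 routers -> 3 iBGP sessions
# 6 routers -> 15 iBGP sessions
# 10 routers -> 45 iBGP sessions
```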

What does a route reflector do?

A Route Reflector readvertises routes learned from internal peers to other internal peers. Only the route reflector needs a full mesh with its internal peers. The elegance of this solution is that it makes iBGP hierarchical.

In the previous example of 6 routers, there are many ways to organize the network with Route Reflectors: one cluster with two route reflectors, two clusters each with two route reflectors, and so on.
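
As a rough comparison, assuming a single cluster where two routers act as route reflectors and every other router is a client of both, the session count drops considerably:

```python
# Back-of-the-envelope: full mesh vs. one cluster with two route reflectors.
def full_mesh(n: int) -> int:
    return n * (n - 1) // 2

def with_route_reflectors(n: int, reflectors: int = 2) -> int:
    clients = n - reflectors
    # Reflectors still mesh with each other; each client peers with each reflector.
    return full_mesh(reflectors) + reflectors * clients

for n in (6, 10):
    print(f"{n} routers: full mesh = {full_mesh(n)} sessions, "
          f"with 2 RRs = {with_route_reflectors(n)} sessions")
# 6 routers: full mesh = 15 sessions, with 2 RRs = 9 sessions
# 10 routers: full mesh = 45 sessions, with 2 RRs = 17 sessions
```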

The astonishing part is that something so fundamental to leveraging BGP is not covered on the CCNP ROUTE exam, according to the exam blueprint.

Exhaustion of IPv4 and IPv6

IPv4 exhaustion is technology’s version of Chicken Little and “the sky is falling.” The sky has been falling on this for 20+ years; we have been warned that IPv4 is running out since the late 1990s. Then came IoT and the Smart Home, which were supposed to strain the IPv4 space. I don’t know about you, but I don’t want my refrigerator and smart thermostat on the internet.

However, every time I go into AWS, I can generate an IPv4 address. Home ISPs are still handing out static IPv4 addresses if you are willing to pay a monthly fee. Enterprise ISPs will hand you a /28 or /29 block without too much effort. Sure, lots of companies, including AWS, Google, and Microsoft, have properties on IPv6, but it’s not widely adopted. The original RFC on IPv6 was published in December of 1995.

I believe the lack of adoption is due to the complexity of the address. If my refrigerator’s IPv4 address is 192.168.0.33, its IPv6 address might be 2001:AAB4:0000:0000:0000:0000:1010:FE01, which can be shortened to 2001:AAB4::1010:FE01. Imagine calling that into tech support, or being the tech support person taking that call. Why didn’t the inventors of IPv6 simply add octets to the existing IP address? For instance, an address like 192.168.0.33.5.101.49 would have been so much more elegant and easier to understand. I think it will take another 15-20 years before IPv6 is widely adopted, and another 50 years before IPv4 is no longer routed within networks.
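
For what it’s worth, the shorthand rules are easy to see with Python’s standard library, using the example address above:

```python
# IPv6 long form vs. shorthand, using the refrigerator example above.
import ipaddress

addr = ipaddress.IPv6Address("2001:AAB4:0000:0000:0000:0000:1010:FE01")

print(addr.exploded)    # 2001:aab4:0000:0000:0000:0000:1010:fe01
print(addr.compressed)  # 2001:aab4::1010:fe01  (the run of zero groups collapses to '::')
```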

To The Cloud and Beyond...

I was having a conversation with an old colleague late Friday afternoon. (Friday was a day of former colleagues; I also had lunch with a great mentor.) He’s responsible for infrastructure and operations at a good-sized company, and his team is embarking on a project to migrate to the cloud, as their data center space contract will be up in 2020. Three things came up in the discussion that I thought were interesting, and they are probably the same issues others face on their journey to the cloud.

The first was a concern about security. The cloud is no more or less secure than your data center. If your data center is private, your cloud assets can be private; if you need public-facing services, they would be secured just like the public-facing services in your own data center. Data security is your responsibility in the cloud, but the cloud doesn’t make your data any less secure.

The second concern was moving VMware images to the cloud. Most of the environment was virtualized years ago; however, there are a lot of Windows 2003 and 2008 servers. Windows 2008 reaches end of support in 2020, and Windows 2003 has been out of support since July 2015. It’s odd to be concerned about cloud security given the age of the Windows environment. If it were my world, I’d figure out how to move those servers to Windows 2016 or retire the ones no longer needed, keeping in mind that OS upgrades are always dependent on the applications. Right or wrong, my roadmap would leave Windows 2003 and 2008 in whatever data center facility is left behind.

Lastly, there was concern about serverless and the application teams wanting to leverage it over his group’s infrastructure services. There was real worry about losing resources if the application teams turn toward serverless, since his organization would have fewer servers (physical or virtual instances) to support. Like many technology shops, infrastructure and operations staffing is formulated from the total number of servers. I find this hugely exciting: I would push resources from “keeping the lights on” to roles focused on growing the business and speed to market, which are the most significant benefits of serverless. Based on this discussion, though, people look at it through their own prism.

Power of Digital Note Taking

There are hundreds of note-taking apps. My favorites are Evernote, GoodNotes, and Quip. I’m not going to get into the pros and cons of each application; there are plenty of blogs and YouTube videos that do that in great detail. Here is how I use them:

  • Evernote is my document and note repository.

  • GoodNotes is for taking handwritten notes on my iPad, and the PDFs are loaded into Evernote.

  • Quip is for team collaboration and sharing notes and documents.

I’ve been digital for 4+ years. Today, I read an ebook from Microsoft entitled “The Innovator’s Guide to Modern Note Taking.” I was curious about Microsoft’s ideas on digital note-taking, and the ebook is worth a read. I found three big takeaways:

First - The ebook quotes: “the average employee spends 76 hours a year looking for misplaced notes, items, and files. In other words, we spend an annual $177 billion across the U.S.”

Second - The ebook explains that the left side of the brain is used when typing on a keyboard, and the right side is used when writing notes by hand. The left side of the brain is more clinical, and the right side is more creative, particularly for asking the “what if” questions. Page 12 of the ebook also covers how handwriting notes improves retention. Lastly, on page 13, is one of my favorites, as I am a doodler: “Doodlers recall on average 29% more information than non-doodlers.” There is a substantial difference between typing and handwriting notes, and there is a great article from NPR if you want to learn more.

Third - Leverage the cloud, whether it’s to share, process, or access your notes anywhere.

Those are fundamentally the three reasons I went all digital for notes. As described above, I write notes in GoodNotes and put them into Evernote, where the OCR for PDFs makes them searchable. My workflow covers the main points described above, which makes me think I might be ahead of a coming trend.

Multi-cloud environments are going to be the most important technology investment in 2018/2019

I believe that Multi-cloud environments are going to be the most important technology investment in 2018/2019, and this will drive education and new skill development among technology workers. Apparently, it’s not just me: IDC predicts that “More than 85% of Enterprise IT Organizations Will Commit to Multicloud Architectures by 2018, Driving up the Rate and Pace of Change in IT Organizations.” There are some great resources online on multi-cloud strategy and benefits, all worth reading:

The list could be hundreds of articles long. I wanted to provide a few that I thought were interesting and relevant to this discussion of why Multi-cloud. There are four drivers behind this trend:

First - Containers allow you to deploy your application anywhere, and all the major cloud players support Kubernetes and Docker. This means you could deploy to AWS, Azure, and Google without rewriting any code. Application support, development, and maintenance are what drive technology dollars, and maintaining one set of code that runs anywhere doesn’t cost any more while giving you complete autonomy.
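
As a rough illustration of that portability, the sketch below is a tiny, dependency-free Python service; baked into a container image, the same image could run unchanged on any of the major providers’ Kubernetes or Docker services. The port and response are arbitrary choices for the example.

```python
# Minimal stdlib-only HTTP service; once packaged into a container image, the
# same image runs unchanged on AWS, Azure, or Google container services.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        # Trivial endpoint so any orchestrator can health-check the container.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the container's published port is reachable.
    HTTPServer(("0.0.0.0", 8080), Health).serve_forever()
```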

Second - Companies like Joyent, Netlify, HashiCorp (Terraform), and many more are building their solutions for multi-cloud, providing control, manageability, ease of use, and so on. Technology is like Field of Dreams: “if you build it, they will come.” Very few large companies jump into something without support; they wait for some level of maturity to develop and then wade in slowly.

Third - The biggest reason is a lack of trust in putting all your technology assets with one company. For years, most companies had multi-data-center strategies, using a combination of self-built facilities, providers like Wipro, IBM, HP, and Digital Realty Trust, and various co-location arrangements. For big companies, when the cloud became popular, the question was how to augment the existing environment with the cloud. Now many companies are applying a Cloud First strategy, so why wouldn’t principles that were applied for decades in technology be applied to the cloud? Everyone remembers the saying: don’t put all your eggs in one basket. I understand there are regions, multi-AZ, resiliency, and redundancy, but at the end of the day one cloud provider is one cloud provider, and all my technology eggs are in that one basket.

Fourth - The last reason is pricing. If you can move your entire workload from Amazon to Google within minutes, it forces cloud vendors to keep costs low, since cloud services charge for what you use. I understand that a workload with petabytes of data behind it isn’t going to move, but web services with small data sets behind them can move, and relatively quickly, with the right deployment tools in place.

What do you think? Leave me a comment with your feedback or ideas.
