Archive of posts from 2019

AWS Certified Machine Learning – Specialty

Image of AWS Machine Learning Speciality Certification

I passed the AWS Certified Machine Learning Specialty exam on Monday. That makes my 10th AWS certification in the last 18 months.

The Machine Learning Specialty certification is unlike any of the other exams from AWS. The exam doesn’t just focus on AWS specifics but covers a wide range of Machine Learning topics. The exam blueprint outlines this coverage.

The exam is probably the hardest of the 10 I’ve taken to date. Throughout the exam, I thought I knew the material, just not well enough to pass. My score was good, and it is satisfying to add this certification. For the Machine Learning exam, I put in well over 200 hours over the last six months and over 80 hours in the four weeks before sitting the exam. I definitely think the Big Data certification helped on the data preparation sections.

There are a bunch of links I studied, which I will share later this week. In addition to all the reading, I did acloud.guru’s AWS Certified Machine Learning - Specialty course, which provides 40% of the material required to pass the exam. The rest of the exam requires detailed knowledge of Machine Learning. I followed the learning track recommended by AWS for Data Scientists. I also did several sections from Linux Academy’s Machine Learning course, including the great section explaining PCA. Lastly, I took the AWS practice exam. I did look at Whizlabs but was somewhat disappointed in their practice tests.

In 2020, I hope to get a project which will allow me to leverage Machine Learning in SageMaker to solve a complex customer problem.

What I Learned About GCP

I’ve been on AWS since February of 2009, and my first bill was for $1.21 for some S3 Storage. Recently, I wanted to understand the Google Cloud Platform, as people talk about Spanner, BigQuery, BigTable, and App Engine. I figured the best way to learn was to challenge myself with a Google certification exam.

Given all my AWS experience, I initially wanted to write a blog article about what I liked and disliked, but I don’t think it’s that simple. There are exciting things within AWS and Google. Both of the platforms are complex, so this by no means is exhaustive. It’s more of what I noticed in my first couple of logins to Google Cloud.

The first thing I noticed was how familiar the services were, service names aside. It didn’t take much to understand the VPCs, IAM, Billing, monitoring, Kubernetes (GKE), and Storage. The service names are vastly different: Google calls everything Cloud blah, and AWS calls everything AWS or Amazon blah. Most of the fundamental principles were the same, especially in primary services like Compute, Storage, and IAM. This probably speaks more to multi-cloud than anything else.

The second thing I found was that the Google Cloud Shell in the browser was outstanding. Google Cloud Shell is a running container that gives you a fully functioning Linux shell with disk space. Cloud Shell can be used for files, configuration files like Kubernetes manifests, and to check out code repositories. The kicker is that it’s embedded into the service and is free. The closest thing AWS offers is the shell inside the Cloud9 service, which comes with an added expense. The Cloud Shell is something I liked on GCP.

The third thing I noticed was the concept of projects, which is a folder construct. I’m not sure if I like it. I saw examples where people used separate folders for dev, test, and Production in the same account. I would be a little concerned given how easy it would be to be in the wrong project and issue commands. I prefer my dev/test to be separate accounts from Production. So I don’t necessarily know if this is a good or bad thing, but I trend toward dislike.

The fourth thing I noticed was the firewall rules. AWS has both the concept of Security Groups and Firewalls (NACLs). GCP only has firewall rules. The rule structure is impressive, as it allows targeting by service account, tags, and IP addresses. My concern is that in a larger environment the firewall rule list would become overly complicated and difficult to read and manage. I much prefer smaller nested security groups on AWS. However, the flexibility of the GCP firewall is impressive. I want the concept of tags inside security groups within AWS. So firewall rules are something I liked.

The fifth thing I want to highlight is the instance configuration. While AWS offers fixed CPU and memory instances, GCP offers custom selections for memory and CPU. This could be very interesting if there is a low-CPU, high-memory workload. I didn’t see significant cost differences between an overprovisioned AWS resource and a custom GCP resource. However, I also didn’t do an in-depth TCO analysis. Again, I see pros and cons to this, and I am probably neutral on this subject.

The last thing is the UI. It is different from AWS, and it took some getting used to. It’s very similar in my experience to the G-Suite Admin or other Google services. I found the configuration of compute to be more challenging given it’s a single page with tabs, vs. the AWS workflow. However, other items like Storage seemed to be more friendly. It isn’t a lousy user experience. Again, I am neutral on this topic; I learned how to use it.

Probably now you are reading this and looking for that summary or in conclusion section. I’m not going to provide it. I remember two decades ago when we wanted to stand up web servers in a data center for a project, and it was going to cost $5,000 before we wrote the first line of code. As struggling college students, this wasn’t going to happen. What I am going to say is to go build something. It’s never been easier for a builder to make an idea come to life on a platform you prefer with minimum investment (free tier). If your game is running Cobol inside a Kubernetes container, go do it. If you hate infrastructure, go Serverless. Cobol on serverless would be attractive, eh? The power is in your hands. If you don’t have any ideas, go get a cloud certification. There has never been a better time for a technologist with cloud experience.

Passed Google Associate Cloud Engineer

I passed the Google ACE Exam. While the acloud.guru course doesn’t provide all the content covered on the exam, it points out all the topics which are required to pass the exam. Before studying for this exam, I had limited GCP experience but extensive AWS experience.

In addition to what is covered in the acloud.guru course, I found the following topics extremely helpful.

https://cloud.google.com/docs/compare/aws/

IAM

https://cloud.google.com/iam/docs/service-account
https://cloud.google.com/compute/docs/access/service-accounts#compute_engine_default_service_account
https://cloud.google.com/iam/docs/understanding-roles
https://cloud.google.com/iam/docs/understanding-roles#primitive_roles
https://cloud.google.com/iam/reference/rest/v1/Policy

Compute

https://cloud.google.com/sdk/gcloud/reference/config/set
https://cloud.google.com/compute/docs/startupscript
https://cloud.google.com/compute/docs/storing-retrieving-metadata
https://cloud.google.com/compute/docs/machine-types
https://cloud.google.com/compute/docs/disks/scheduled-snapshots
https://cloud.google.com/compute/docs/instance-groups/#autohealing

Storage

https://cloud.google.com/storage/docs/storage-classes

Analytics

https://cloud.google.com/bigtable/
https://cloud.google.com/billing/docs/how-to/export-data-file
https://cloud.google.com/billing/docs/how-to/export-data-bigquery

App Engine

https://cloud.google.com/sdk/gcloud/reference/app/deploy
https://cloud.google.com/sdk/gcloud/reference/deployment-manager/deployments/list
https://cloud.google.com/appengine/docs/standard/php/an-overview-of-app-engine#limits

Networking

https://cloud.google.com/vpc/docs/using-vpc
https://cloud.google.com/vpc/docs/firewalls
https://cloud.google.com/compute/docs/ip-addresses/
https://cloud.google.com/load-balancing/
https://cloud.google.com/load-balancing/docs/choosing-load-balancer
https://cloud.google.com/router/docs/

Kubernetes

https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
https://cloud.google.com/kubernetes-engine/docs/concepts/statefulset
https://cloud.google.com/kubernetes-engine/docs/concepts/pod
https://cloud.google.com/kubernetes-engine/docs/concepts/daemonset
https://cloud.google.com/sdk/gcloud/reference/container/clusters/create
https://cloud.google.com/sdk/gcloud/reference/container/clusters/resize
https://cloud.google.com/kubernetes-engine/docs/quickstart
https://kubernetes.io/docs/tutorials/kubernetes-basics/explore/explore-intro/
https://cloud.google.com/kubernetes-engine/quotas
https://cloud.google.com/kubernetes-engine/docs/troubleshooting

Billing

https://cloud.google.com/billing/docs/how-to/budgets

DB

https://cloud.google.com/sql/
https://cloud.google.com/sql/docs/mysql/backup-recovery/restore
https://dev.mysql.com/doc/refman/8.0/en/binary-log.html
https://cloud.google.com/db-migration/
https://cloud.google.com/spanner/
https://cloud.google.com/datastore/

Functions

https://cloud.google.com/functions/docs/concepts/overview

Stackdriver

https://cloud.google.com/error-reporting/
https://cloud.google.com/logging/
https://cloud.google.com/profiler/
https://cloud.google.com/debugger/
https://cloud.google.com/trace/
https://cloud.google.com/logging/docs/audit/

Several people in the forums and on the Internet have made comments comparing the GCP ACE to AWS. I found that the difficulty of the exam compares to the AWS Solutions Architect Associate combined with the AWS SysOps Associate exam.

Thank you Mattias Anderson for putting together an excellent course on acloud guru.

I am thinking about pursuing the Google Cloud Professional Architect, before diving into some other certifications.

Big Data Certification

Image of AWS Big Data Speciality Certification

I passed the AWS Certified Big Data Specialty exam on Saturday. That makes my 9th AWS certification in the last 10 months. For a moment I’ll have 9/9 certifications. Machine Learning opens this month, so come tomorrow I’ll have 9/10 certifications. The recommended training for Machine Learning is Big Data on AWS and Deep Learning on AWS. Given I just completed Big Data, I will probably schedule this exam for sometime in May.

The Big Data certification exam is similar to the other specialty exams. While not necessarily as hard as the Professional level exams, it does require a detailed level of knowledge. Also, unlike the other specialty exams, Big Data requires a breadth and depth of knowledge consistent with the Professional level exams. I prepared using acloud.guru’s AWS Certified Big Data - Specialty course, which provides somewhere between 50% and 60% of the required topics around Kinesis, IoT, S3, DynamoDB, EMR, Redshift, and QuickSight. I did review some topics in Linux Academy to reinforce the concepts. The rest of the preparation was hands-on experience and lab learning. AWS doesn’t offer a practice exam, so I tried the Whizlabs practice exams. Whizlabs’ practice exams typically have issues and provide a false level of confidence, as they are always easier than the actual certification exam.

Acloud.guru covers a lot of information, and it also provides a set of links to critical whitepapers and blog articles. As always, without violating the NDA, they do an excellent job of pointing you to the topics to study. Aside from that material, I read a whole bunch of AWS links, which are posted at the end of this blog article. Also, there was a great YouTube playlist John Creecy put together at https://www.youtube.com/playlist?list=PLlp-qT09uTBcoMpiQkpO-G8GsHOVWyfV0.

I had relatively little experience with Kinesis, EMR, Redshift, and QuickSight before studying for the exam. I found Kinesis, Redshift, and Elasticsearch fascinating, and will be looking for projects in this space to continue my learning.

Kinesis
https://docs.aws.amazon.com/streams/latest/dev/key-concepts.html
https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-ddb.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-split.html
https://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-kpl.html
https://docs.aws.amazon.com/streams/latest/dev/building-consumers.html
https://docs.aws.amazon.com/streams/latest/dev/creating-using-sse-master-keys.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-producer-adv-retries-rate-limiting.html
https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html
https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-kcl.html
https://docs.aws.amazon.com/streams/latest/dev/agent-health.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-merge.html

Kinesis Firehose
https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html#data-flow-diagrams
https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
https://docs.aws.amazon.com/firehose/latest/dev/create-configure.html
https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html#lambda-blueprints
https://docs.aws.amazon.com/firehose/latest/dev/encryption.html

Kinesis Data Analytics
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/what-is.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/streams-pumps.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/authentication-and-access-control.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/stagger-window-concepts.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/tumbling-window-concepts.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/sliding-window-concepts.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/continuous-queries-concepts.html

IoT
https://docs.aws.amazon.com/iot/latest/developerguide/what-is-aws-iot.html
https://docs.aws.amazon.com/iot/latest/developerguide/policy-actions.html
https://docs.aws.amazon.com/iot/latest/developerguide/iam-policies.html
https://docs.aws.amazon.com/iot/latest/developerguide/iot-provision.html
https://docs.aws.amazon.com/iot/latest/developerguide/iot-device-shadows.html
https://docs.aws.amazon.com/iot/latest/developerguide/iot-rule-actions.html

ElasticSearch
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/what-is-amazon-elasticsearch-service.html
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-bp.html
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-aws-integrations.html

CloudSearch
https://docs.aws.amazon.com/cloudsearch/latest/developerguide/what-is-cloudsearch.html

EMR
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-overview.html#emr-overview-clusters
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-file-systems.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-consistent-view.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-encryption-enable.html#emr-awskms-keys
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-data-encryption-options.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emrfs-configure-sqs-cw.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-flink.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-tez.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hbase.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hcatalog.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-zookeeper.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-phoenix.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-sqoop.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-presto.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-jupyter-emr-managed-notebooks.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-jupyterhub.html

QuickSight
https://docs.aws.amazon.com/quicksight/latest/user/welcome.html
https://docs.aws.amazon.com/quicksight/latest/user/refreshing-imported-data.html
https://docs.aws.amazon.com/quicksight/latest/user/joining-tables.html
https://docs.aws.amazon.com/quicksight/latest/user/bar-charts.html
https://docs.aws.amazon.com/quicksight/latest/user/combo-charts.html
https://docs.aws.amazon.com/quicksight/latest/user/heat-map.html
https://docs.aws.amazon.com/quicksight/latest/user/line-charts.html
https://docs.aws.amazon.com/quicksight/latest/user/kpi.html
https://docs.aws.amazon.com/quicksight/latest/user/restrict-access-to-a-data-set-using-row-level-security.html#create-row-level-security
https://docs.aws.amazon.com/quicksight/latest/user/tabular.html
https://docs.aws.amazon.com/quicksight/latest/user/supported-data-sources.html
https://docs.aws.amazon.com/quicksight/latest/user/scatter-plot.html
https://docs.aws.amazon.com/quicksight/latest/user/geospatial-data-prep.html

Redshift
https://docs.aws.amazon.com/redshift/latest/dg/tutorial-tuning-tables-distribution.html
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-best-dist-key.html
https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-nodes
https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-working-with-endpoints.html
https://docs.aws.amazon.com/redshift/latest/dg/c_designing-queries-best-practices.html
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-use-copy.html
https://docs.aws.amazon.com/redshift/latest/dg/c_intro_STL_tables.html
https://docs.aws.amazon.com/redshift/latest/dg/c_intro_STV_tables.html
https://docs.aws.amazon.com/redshift/latest/dg/cm-c-implementing-workload-management.html
https://docs.aws.amazon.com/redshift/latest/dg/wlm-short-query-acceleration.html

DynamoDB
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-design.html#bp-partition-key-partitions-adaptive
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/globaltables_monitoring.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-data-upload.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/globaltables_reqs_bestpractices.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-aggregation.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-overloading.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-gsi-sharding.html

Machine Learning
https://docs.aws.amazon.com/machine-learning/latest/dg/types-of-ml-models.html
https://docs.aws.amazon.com/machine-learning/latest/dg/binary-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/regression-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/multiclass-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/ml-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/cross-validation.html
https://docs.aws.amazon.com/machine-learning/latest/dg/creating-and-using-datasources.html
https://docs.aws.amazon.com/machine-learning/latest/dg/creating-a-data-schema-for-amazon-ml.html
https://docs.aws.amazon.com/machine-learning/latest/dg/amazon-machine-learning-key-concepts.html

Pipeline
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-how-tasks-scheduled.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-concepts-datanodes.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-concepts-databases.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-importexport-ddb-part1.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/datapipeline-related-services.html

Data Movement
https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_Welcome.html

Athena
https://docs.aws.amazon.com/athena/latest/ug/access.html
https://docs.aws.amazon.com/athena/latest/ug/encryption.html#encryption-options-S3-and-Athena
https://docs.aws.amazon.com/athena/latest/ug/athena-aws-service-integrations.html

Glue
https://docs.aws.amazon.com/glue/latest/dg/components-overview.html

Advanced Architecting on AWS

I took Advanced Architecting on AWS over the last three days. The course is part of the learning process for the AWS Certified Solutions Architect – Professional. I already have the certification based on the older version of the exam. The new version of the certification exam went live on February 4th. The course seems to follow the newer certification guide. Overall the course is good as it covers all the services required, but the labs were a little disappointing as they lacked complexity. To become proficient and attempt the certification, one would need to do a lot more learning and deep diving on the topics covered in this course. It reviews probably 35% of the material required to sit the exam.

Here is my summary by day of the course.

Day One

The morning was spent covering Account Management and multiple accounts, leading to AWS Organizations with service control policies. It finished on billing. The next two discussions were around Advanced Networking Architectures, then VPN and Direct Connect. The afternoon finished with a discussion on Deployments on AWS, which was an abbreviated version of material covered in the DevOps course.

Day Two

The morning started with data, specifically discussing S3 and ElastiCache. Next, it was all about data import into AWS with Snowball, Snowmobile, S3 Transfer Acceleration, and Storage Gateways (Tape Gateway, Volume Gateway, and File Gateway), and finished with DataSync and Database Migration.

The afternoon was spent on Big Data Architecture and Designing Large Scale Applications and finished with a lab on Blue-Green Deployments on Elastic Beanstalk.

Day Three

The last day was spent on Building Resilient Architectures, and encryption and Data Security. The day ended early with a Lab on KMS. The lab provided some basic KMS and OpenSSL encryption steps.

I thought the course missed an opportunity to talk about DR architectures.

It’s an interesting course and worth taking if you’re interested in learning more or planning to take the certifications.

Violating Security Policies

Dark Reading published a blog article entitled 6 Reasons Why Employees Violate Security Policies. The 6 reasons according to the article are:

  1. Ignorance
  2. Convenience
  3. Frustration
  4. Ambition
  5. Curiosity
  6. Helpfulness

I think they’re neglecting to get to the root of the issue, which is draconian security policies that don’t make things more secure. Over the years, I’ve seen similar policies coming from InfoSec groups. It’s common for developers to want to use the tools they’re comfortable with. In an extreme case, I’ve seen developers wanting to use Eclipse for development while Eclipse is forbidden, because the only safe editor according to some InfoSec policy is VI (probably slightly exaggerated). Other extreme cases include banning Evernote or OneNote because they use cloud storage. I’m assuming here that someone is not putting all their confidential customer data in a OneNote book.

Given what I’ve seen, employees violate security policies to get work done the way they want to do it. Maybe that’s ignorance, convenience, frustration, ambition, or any other reason. Or maybe, if you’ve used something for 10 years, you don’t want to have to learn something new for development or keeping notes, given there are many other things to learn and do which add value to your job and employer.

Maybe to keep employees from violating InfoSec policies, InfoSec groups, instead of writing draconian security policies, could focus on identifying the security vulnerabilities that are more likely targets of hackers, and put policies, procedures, and operational security around them. Lastly, InfoSec could spend time educating employees on what confidential data is and where it is allowed to be stored.

Disclaimer: This blog article is not meant to condone, encourage, or motivate people to violate security policies.

What is the difference between a CDO, CTO and a CIO?

I got into an interesting discussion on the difference between a CDO, CTO, and CIO. The initial discussion started with whether all those positions are required in an organization. The group eventually agreed the answer was yes. The logic was that, given everything we do is digital, digital needs multiple seats at the executive table. The reason for this blog article is the question of where these roles fit within an organization. Let’s take a step back and explain how we defined the roles.

CDO should own e-commerce, mobile environments, and technology customer outreach. In a digital product company, they own the product roadmap. The CDO is responsible for all digital customer touch points. The technology partner for the CDO is the CMO or SVP of Sales. This role should be driving the business, and be a business enabler.

CIO should own the back office technology like email, ERP, messaging, desktops, laptops, printers, networking, service desks, and traditional data centers. Typically this is the technology organization that is a cost center.

CTO should own the architecture and technology of the platforms. CTO is the technology partner for both the CDO and CIO. Their job should be to have uniformity, coalesce ideas across technology and work with the various stakeholders to ensure proper architecture governance (think TOGAF architecture review boards).

The group was pretty emphatic that the CDO should report to the CEO. Now, this is where things break down for the remaining roles. The role defined for the CIO is an operational role, making sure essential infrastructure services and users can function. The group was split 50/50: half the group thought the CIO should report to the CDO, and the other half said some other C-level executive, like the CFO or COO.

The more complicated issue is where the CTO reports. The CTO is responsible for the architecture and technology of the platform, which makes them a partner of the CDO, but they also own architecture review, which makes them a partner of the CIO. So where does the CTO report?

The CDO has an entirely different objective than the CIO. If the CIO reports to the CDO, it would make sense to have the CTO report there. However, what happens when the CIO doesn’t report to the CDO? What happens if the CIO reports to the COO?

After several rounds of mental gymnastics, the group agreed to coalesce around two outcomes. In the first, both the CIO and the CTO report to the CDO. Basically, the CTO and CIO become peers in the same organization. In the other, the CIO reports to the CTO, and both the CTO and CDO report to the CEO.

Using Athena to Query ALB Logs

One of the more interesting AWS Big Data Services is Amazon Athena. Athena can process S3 data in a few seconds. One of the ways I like using it is to look for patterns in ALB access logs.

AWS provides detailed instructions on how to set up Athena to query ALB access logs. I’m not going to recap the configuration in this blog article, but I will share 3 of my favorite queries.

What are the most visited pages by client, and the total traffic on my website?

SELECT sum(received_bytes) as total_received, sum(sent_bytes) as total_sent, client_ip, 
count(client_ip) as client_requests, request_url  
FROM alb_logs 
GROUP BY client_ip, request_url  
ORDER BY total_sent  desc;

How long does it take to process requests on average?

-- Only URL/target pairs with more than four requests are reported
SELECT sum(request_processing_time) as request_pt, sum(target_processing_time) as target_pt,
sum(response_processing_time) as response_pt,
sum(request_processing_time + target_processing_time + response_processing_time) as total_pt,
count(request_processing_time) as total_requests,
sum(request_processing_time + target_processing_time + response_processing_time) / count(request_processing_time) as avg_pt,
request_url, target_ip
FROM alb_logs WHERE target_ip <> ''
GROUP BY request_url, target_ip
HAVING count(request_processing_time) > 4
ORDER BY avg_pt desc;

This last one is looking for requests the site doesn’t process. It’s usually some person trying to find some vulnerable PHP code.

SELECT count(client_ip) as client_requests, client_ip, target_ip, request_url, 
target_status_code 
FROM alb_logs 
WHERE target_status_code not in ('200','301','302','304') 
GROUP BY client_ip, target_ip, request_url, target_status_code
ORDER BY client_requests desc; 
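
If you would rather run these queries from a script than from the console, something like the following Python sketch could work. It assumes boto3 is installed and AWS credentials are configured; the helper names, the database name, and the S3 results location are placeholders I made up, not part of the AWS setup instructions.

```python
import time


def build_query_request(sql, database, output_location):
    """Build the keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(sql, database, output_location, poll_seconds=1.0):
    """Submit a query, poll until it finishes, and return (state, query_id)."""
    # Imported here so build_query_request works without boto3 installed.
    import boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        **build_query_request(sql, database, output_location)
    )
    query_id = response["QueryExecutionId"]

    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state, query_id
        time.sleep(poll_seconds)
```

Calling run_query with one of the queries above, a database such as "default", and a results location such as "s3://example-bucket/athena-results/" (both hypothetical) returns the final state and the query ID, which can then be used to fetch the results.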

Athena is a serverless tool that sets up in seconds and charges based on TB scanned, with a 10 MB minimum per query.
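
To get a feel for what that pricing model means, here is a rough Python sketch of the arithmetic. The price per TB is an assumed figure used only for illustration (check the current Athena pricing page for your region), and I am assuming decimal units for MB and TB.

```python
BYTES_PER_MB = 1_000_000          # assumption: decimal megabytes
BYTES_PER_TB = 1_000_000_000_000  # assumption: decimal terabytes
MIN_BILLED_BYTES = 10 * BYTES_PER_MB  # the 10 MB per-query minimum


def estimate_query_cost(bytes_scanned, price_per_tb=5.0):
    """Estimate the charge for one query; price_per_tb is an assumed rate."""
    # Small queries are billed as if they scanned the minimum amount.
    billed_bytes = max(bytes_scanned, MIN_BILLED_BYTES)
    return billed_bytes / BYTES_PER_TB * price_per_tb
```

Under these assumptions, a query that scans 2 TB costs about ten dollars, while a query that scans a few kilobytes is billed the same as one that scans 10 MB. Partitioning the ALB logs so each query scans less data is the main lever on cost.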

DevOps Engineering on AWS

I took DevOps Engineering on AWS over the last three days. The course is part of the learning process for the AWS Certified DevOps Engineer – Professional. Overall the course is excellent. It covers substantial material, and the labs are ok. To become proficient, one should do the labs from scratch and build the CloudFormation templates. It reviews 45-50% of the material on the DevOps exam, so each topic requires a deeper dive before sitting the exam.

Here is my summary by day of the course.

Day One

The class started with an introduction to DevOps and the AWS tools which support Devops:

It’s interesting as CodeBuild, CodeDeploy, and CodePipeline are required to replace Jenkins. Their advantage is that it directly integrate with AWS. One question I have is why isn’t there a service like Jfrog Artifactory

One of my favorite topics was DevSecOps, which is about adding security into the DevOps process. There should be a separate certification and course for DevSecOps or SecDevOps.

There was minimal discussion of Elastic Beanstalk, which was a big part of the old acloud.guru course and had several questions on the old exam.

Lastly, the day focused on various methods for updating applications: in-place updates, rolling updates, Blue/Green deployments, and Red/Black deployments.
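
As a rough illustration of the rolling update method, the sketch below splits a fleet into fixed-size batches so that only one batch is out of service at a time. The instance IDs and the `rolling_batches` helper are hypothetical; this is not how CodeDeploy or any AWS service implements it.

```python
from typing import Iterator, List


def rolling_batches(instances: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive batches of instances to update one batch at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(instances), batch_size):
        yield instances[start:start + batch_size]


# Hypothetical fleet of five instances, updated two at a time.
fleet = ["i-1", "i-2", "i-3", "i-4", "i-5"]
for number, batch in enumerate(rolling_batches(fleet, 2), start=1):
    # A real deployment would drain, update, and health-check each batch here.
    print(f"batch {number}: {batch}")
```

An in-place update is roughly the case where the batch covers the whole fleet, while Blue/Green instead stands up a second fleet and switches traffic over to it.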

Day Two

The class started with a lab on CloudFormation. The lab was flawed, as it did a code deployment via cfn-init and cfn-hup. The rest of the morning was a deeper dive into the tools discussed throughout Day One.

The afternoon lab focused on a pipeline, CodeBuild, and CodeDeploy. After the lab, we spent time discussing various types of testing, CloudWatch Logs, and OpsWorks. Most of the discussion was theoretical.

Day Three

The first part of the morning was a 2-hour lab on AWS OpsWorks, setting up a Chef recipe and scaling out the environment. The rest of the class was devoted to containers, primarily ECS, with a lab that deployed an application on containers.

It’s an interesting course and worth taking if you’re doing AWS DevOps or planning to take the certifications.


Goodnotes 5

Goodnotes 5 was released last week. Goodnotes is my favorite stylus note-taking app on the iPad. I’ve tried most of the competitors at least once and revisit them when they release new features. I’ve been on Goodnotes for years and have been using it daily.

Let’s move to the topic of this post: Goodnotes 5 is a bit buggy. There were a ton of negative comments on Twitter about the release. The development team has released 7 updates as of the writing of this post. Goodnotes 5 is not a forced upgrade from version 4. While I’ve not seen all the problems described on Twitter, I’ve seen a few of the issues. I knew when installing the initial release that there were going to be some bugs.

However, think about the DevOps model: release, fix, release, fix, release, fix. The model is built for this type of release and user feedback.

However, many of the Twitter complaints asked why buggy software was released. So it made me think: when is software ready for release in the DevOps model? Typically there is a release once code passes unit tests, integration tests, load tests, functional tests, and GUI tests. However, bugs do reach production and the users; there is no foolproof plan.

The App Store doesn’t allow releasing beta software. However, Apple does offer TestFlight, so maybe Goodnotes could have leveraged 10,000 of its customers to beta test the software and avoided the negative backlash on Twitter.


Jekyll

Decided to try a switch from WordPress to Jekyll. While WordPress provides a ton of features, the interface for creating blog entries is overly burdensome. Also, WordPress continues to announce security vulnerabilities. Jekyll uses Markdown files, which provide a pure editing experience. However, almost anything in Jekyll requires modifying layout and include HTML files. Jekyll uses Liquid to provide development capabilities inside the HTML files. Jekyll combines the templates with the Markdown into static HTML files.

My two favorite things are that Jekyll can be run on a local workstation so you can preview changes, and that everything can be checked into a Git repository.

This is the initial release; aside from some minor issues, content is showing up. The next release of this blog will include comments and search functionality.


Cloud Practitioner

Passed the AWS Cloud Practitioner Certification Exam. Given I had 7 of the 9 certifications before sitting this exam, I didn’t study. The goal before taking the exam was 100% in 20 minutes. I missed 3 questions and took 16 minutes. I took the exam because at some point I am going to complete the Big Data Specialty, which will give me all the AWS certifications for a brief moment. The Machine Learning AI beta completed last month, and the Alexa Skill Builder just completed its beta. This means by March there could be 10 or 11 AWS certifications.


DevOps Pro Links

I posted to GitHub a list of links I found valuable when studying for the AWS DevOps Pro certification exam.

The original blog article about passing the test can be found here: AWS Certified DevOps Engineer - Professional
