Big Data Certification

Image of AWS Big Data Speciality Certification

I passed the AWS Certified Big Data Specialty exam on Saturday. That makes it my 9th AWS certification in the last 10 months. For a moment I’ll have 9/9 certifications. The Machine Learning exam opens this month, so come tomorrow I’ll have 9/10 certifications. The recommended training for Machine Learning is Big Data on AWS and Deep Learning on AWS. Given that I just completed Big Data, I’ll probably schedule that exam for sometime in May.

The Big Data certification exam is similar to the other specialty exams. While not necessarily as hard as the Professional-level exams, it does require a detailed level of knowledge. Unlike the other specialty exams, though, Big Data requires a breadth and depth of knowledge consistent with the Professional-level exams. I prepared using acloud.guru’s AWS Certified Big Data - Specialty course, which covers somewhere between 50% and 60% of the required topics around Kinesis, IoT, S3, DynamoDB, EMR, Redshift, and QuickSight. I also reviewed some topics on Linux Academy to reinforce the concepts. The rest came from hands-on experience and labs. AWS doesn’t offer a practice exam, so I tried the Whizlabs practice exams. These typically have issues and give a false sense of confidence, as they are always easier than the actual certification exam.

The acloud.guru course covers a lot of material, and it also provides a set of links to critical whitepapers and blog articles. As always, without violating the NDA, they do an excellent job of pointing you to the topics to study. Aside from that material, I read a whole bunch of AWS documentation pages, which are listed at the end of this article. There is also a great YouTube playlist that John Creecy put together at https://www.youtube.com/playlist?list=PLlp-qT09uTBcoMpiQkpO-G8GsHOVWyfV0.

I had relatively little experience with Kinesis, EMR, Redshift, and QuickSight before studying for the exam. I found Kinesis, Redshift, and Elasticsearch fascinating, and will be looking for projects in this space to continue my learning.

Kinesis
https://docs.aws.amazon.com/streams/latest/dev/key-concepts.html
https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-ddb.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-split.html
https://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-kpl.html
https://docs.aws.amazon.com/streams/latest/dev/building-consumers.html
https://docs.aws.amazon.com/streams/latest/dev/creating-using-sse-master-keys.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-producer-adv-retries-rate-limiting.html
https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html
https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-kcl.html
https://docs.aws.amazon.com/streams/latest/dev/agent-health.html
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-merge.html

Kinesis Firehose
https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html#data-flow-diagrams
https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
https://docs.aws.amazon.com/firehose/latest/dev/create-configure.html
https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html#lambda-blueprints
https://docs.aws.amazon.com/firehose/latest/dev/encryption.html

Kinesis Data Analytics
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/what-is.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/streams-pumps.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/authentication-and-access-control.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/stagger-window-concepts.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/tumbling-window-concepts.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/sliding-window-concepts.html
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/continuous-queries-concepts.html

IoT
https://docs.aws.amazon.com/iot/latest/developerguide/what-is-aws-iot.html
https://docs.aws.amazon.com/iot/latest/developerguide/policy-actions.html
https://docs.aws.amazon.com/iot/latest/developerguide/iam-policies.html
https://docs.aws.amazon.com/iot/latest/developerguide/iot-provision.html
https://docs.aws.amazon.com/iot/latest/developerguide/iot-device-shadows.html
https://docs.aws.amazon.com/iot/latest/developerguide/iot-rule-actions.html

Elasticsearch
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/what-is-amazon-elasticsearch-service.html
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-bp.html
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-aws-integrations.html

CloudSearch
https://docs.aws.amazon.com/cloudsearch/latest/developerguide/what-is-cloudsearch.html

EMR
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-overview.html#emr-overview-clusters
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-file-systems.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-consistent-view.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-encryption-enable.html#emr-awskms-keys
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-data-encryption-options.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emrfs-configure-sqs-cw.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-flink.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-tez.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hbase.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hcatalog.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-zookeeper.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-phoenix.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-sqoop.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-presto.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-jupyter-emr-managed-notebooks.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-jupyterhub.html

QuickSight
https://docs.aws.amazon.com/quicksight/latest/user/welcome.html
https://docs.aws.amazon.com/quicksight/latest/user/refreshing-imported-data.html
https://docs.aws.amazon.com/quicksight/latest/user/joining-tables.html
https://docs.aws.amazon.com/quicksight/latest/user/bar-charts.html
https://docs.aws.amazon.com/quicksight/latest/user/combo-charts.html
https://docs.aws.amazon.com/quicksight/latest/user/heat-map.html
https://docs.aws.amazon.com/quicksight/latest/user/line-charts.html
https://docs.aws.amazon.com/quicksight/latest/user/kpi.html
https://docs.aws.amazon.com/quicksight/latest/user/restrict-access-to-a-data-set-using-row-level-security.html#create-row-level-security
https://docs.aws.amazon.com/quicksight/latest/user/tabular.html
https://docs.aws.amazon.com/quicksight/latest/user/supported-data-sources.html
https://docs.aws.amazon.com/quicksight/latest/user/scatter-plot.html
https://docs.aws.amazon.com/quicksight/latest/user/geospatial-data-prep.html

Redshift
https://docs.aws.amazon.com/redshift/latest/dg/tutorial-tuning-tables-distribution.html
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-best-dist-key.html
https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-nodes
https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-working-with-endpoints.html
https://docs.aws.amazon.com/redshift/latest/dg/c_designing-queries-best-practices.html
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-use-copy.html
https://docs.aws.amazon.com/redshift/latest/dg/c_intro_STL_tables.html
https://docs.aws.amazon.com/redshift/latest/dg/c_intro_STV_tables.html
https://docs.aws.amazon.com/redshift/latest/dg/cm-c-implementing-workload-management.html
https://docs.aws.amazon.com/redshift/latest/dg/wlm-short-query-acceleration.html

DynamoDB
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-design.html#bp-partition-key-partitions-adaptive
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/globaltables_monitoring.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-data-upload.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/globaltables_reqs_bestpractices.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-aggregation.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-overloading.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-gsi-sharding.html

Machine Learning
https://docs.aws.amazon.com/machine-learning/latest/dg/types-of-ml-models.html
https://docs.aws.amazon.com/machine-learning/latest/dg/binary-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/regression-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/multiclass-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/ml-model-insights.html
https://docs.aws.amazon.com/machine-learning/latest/dg/cross-validation.html
https://docs.aws.amazon.com/machine-learning/latest/dg/creating-and-using-datasources.html
https://docs.aws.amazon.com/machine-learning/latest/dg/creating-a-data-schema-for-amazon-ml.html
https://docs.aws.amazon.com/machine-learning/latest/dg/amazon-machine-learning-key-concepts.html

Pipeline
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-how-tasks-scheduled.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-concepts-datanodes.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-concepts-databases.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-importexport-ddb-part1.html
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/datapipeline-related-services.html

Data Movement
https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_Welcome.html

Athena
https://docs.aws.amazon.com/athena/latest/ug/access.html
https://docs.aws.amazon.com/athena/latest/ug/encryption.html#encryption-options-S3-and-Athena
https://docs.aws.amazon.com/athena/latest/ug/athena-aws-service-integrations.html

Glue
https://docs.aws.amazon.com/glue/latest/dg/components-overview.html
