snowplow
d261fcd7fae6
Merge branch 'release/r87-chichen-itza'
Alex Dean
2 months ago

@@ -12,17 +12,17 @@
# Author:: Alex Dean (mailto:support@snowplowanalytics.com)
# Copyright:: Copyright (c) 2012-2014 Snowplow Analytics Ltd
# License:: Apache License Version 2.0

source "https://rubygems.org"

-ruby "2.2.3"
+ruby "2.3.1"

# EmrEtlRunner is a Ruby app (not a RubyGem)
# built with Bundler, so we add in the
# RubyGems it requires here.
gem "contracts", "~> 0.9", "<= 0.11"
-gem "elasticity", "~> 6.0.7"
+gem "elasticity", "~> 6.0.10"
gem "sluice", "~> 0.4.0"
gem "awrence", "~> 0.1.0"
gem "snowplow-tracker", "~> 0.5.2"

group :development do

@@ -13,11 +13,11 @@
      tins (>= 1.6.0, < 2)
    diff-lcs (1.2.5)
    docile (1.1.5)
    domain_name (0.5.20160826)
      unf (>= 0.0.5, < 1.0.0)
-    elasticity (6.0.8)
+    elasticity (6.0.10)
      fog (~> 1.0)
      rest-client (~> 1.0)
      unf (~> 0.1)
    excon (0.52.0)
    fission (0.5.0)
...
@@ -138,16 +138,16 @@
DEPENDENCIES
  awrence (~> 0.1.0)
  contracts (~> 0.9, <= 0.11)
  coveralls
-  elasticity (~> 6.0.7)
+  elasticity (~> 6.0.10)
  rspec (~> 2.14, >= 2.14.1)
  sluice (~> 0.4.0)
  snowplow-tracker (~> 0.5.2)
  warbler

RUBY VERSION
-   ruby 2.2.3p0 (jruby 9.0.5.0)
+   ruby 2.3.1p0 (jruby 9.1.6.0)

BUNDLED WITH
-   1.12.5
+   1.13.7

@@ -43,10 +43,13 @@
  $stderr.puts(e.message)
  exit 1

# Special retval so rest of pipeline knows not to continue
rescue runner::NoDataToProcessError => e
  exit 3

+# Special retval to flag previous pipeline is unfinished
+rescue runner::DirectoryNotEmptyError => e
+  exit 4

# Catch any Snowplow error
rescue runner::Error => e
  fatal.call(e)
  exit 1

rescue SystemExit => e
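With this change the runner reserves a dedicated return code for each control-flow outcome: 1 for fatal errors, 3 when there is no data to process, and 4 when a previous run has left the processing location non-empty. A minimal sketch of how an orchestration script might branch on these codes; the executable name and flags below are illustrative assumptions, not taken from this diff:

# Hypothetical wrapper around EmrEtlRunner; binary name and arguments are assumptions.
system("snowplow-emr-etl-runner", "--config", "config.yml", "--resolver", "resolver.json")

case $?.exitstatus
when 0
  puts "EMR run completed - safe to continue with StorageLoader"
when 3
  puts "No data to process - nothing to do for this run"         # NoDataToProcessError
when 4
  abort "Processing bucket not empty - previous run unfinished"  # DirectoryNotEmptyError
else
  abort "EmrEtlRunner failed - investigate before re-running"
end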

@@ -13,10 +13,10 @@
# Author:: Fred Blundun (mailto:support@snowplowanalytics.com)
# Copyright:: Copyright (c) 2015 Snowplow Analytics Ltd
# License:: Apache License Version 2.0

-rvm install jruby
-bash -l -c 'rvm use jruby'
+rvm install jruby-9.1.6.0
+bash -l -c 'rvm use jruby-9.1.6.0'
gem install bundler
bundle install
rake

@@ -7,13 +7,13 @@
    buckets:
      assets: s3://snowplow-hosted-assets # DO NOT CHANGE unless you are hosting the jarfiles etc yourself in your own bucket
      jsonpath_assets: # If you have defined your own JSON Schemas, add the s3:// path to your own JSON Path files in your own bucket here
      log: ADD HERE
      raw:
-        in: # Multiple in buckets are permitted
-          - ADD HERE # e.g. s3://my-in-bucket
-          - ADD HERE
+        in: # This is a YAML array of one or more in buckets - you MUST use hyphens before each entry in the array, as below
+          - ADD HERE # e.g. s3://my-old-collector-bucket
+          - ADD HERE # e.g. s3://my-new-collector-bucket
        processing: ADD HERE
        archive: ADD HERE # e.g. s3://my-archive-bucket/raw
      enriched:
        good: ADD HERE # e.g. s3://my-out-bucket/enriched/good
        bad: ADD HERE # e.g. s3://my-out-bucket/enriched/bad
...
@@ -39,10 +39,15 @@
    # Adjust your Hadoop cluster below
    jobflow:
      master_instance_type: m1.medium
      core_instance_count: 2
      core_instance_type: m1.medium
+      core_instance_ebs:   # Optional. Attach an EBS volume to each core instance.
+        volume_size: 100   # Gigabytes
+        volume_type: "gp2"
+        volume_iops: 400   # Optional. Will only be used if volume_type is "io1"
+        ebs_optimized: false # Optional. Will default to true
      task_instance_count: 0 # Increase to use spot instances
      task_instance_type: m1.medium
      task_instance_bid: 0.015 # In USD. Adjust bid, or leave blank for non-spot-priced (i.e. on-demand) task instances
      bootstrap_failure_tries: 3 # Number of times to attempt the job in the event of bootstrap failures
      additional_info: # Optional JSON string for selecting additional features
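A quick way to sanity-check the two additions above (the YAML array of raw in buckets and the optional core_instance_ebs block) is to load the file and inspect the parsed structure. A minimal sketch, assuming the sample has been copied to a local config.yml and the ADD HERE placeholders filled in (both assumptions):

require 'yaml'

# Path is an assumption for illustration; the runner loads its own copy of the config.
config = YAML.load(File.read('config.yml'))

# raw:in is a YAML array, which is why each entry needs a leading hyphen.
in_buckets = config['aws']['s3']['buckets']['raw']['in']
raise 'aws:s3:buckets:raw:in must be an array of buckets' unless in_buckets.is_a?(Array)

# The optional EBS block for core instances introduced in this release.
ebs = config['aws']['emr']['jobflow']['core_instance_ebs']
puts "EBS per core instance: #{ebs['volume_size']} GB (#{ebs['volume_type']})" if ebs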

@@ -26,8 +26,8 @@
require_relative 'snowplow-emr-etl-runner/runner'

module Snowplow
  module EmrEtlRunner
    NAME = "snowplow-emr-etl-runner"
-    VERSION = "0.22.0"
+    VERSION = "0.23.0"
  end
end

@@ -56,11 +56,11 @@
      opts.on('-n', '--enrichments ENRICHMENTS', 'enrichments directory') {|config| options[:enrichments_directory] = config}
      opts.on('-r', '--resolver RESOLVER', 'Iglu resolver file') {|config| options[:resolver_file] = config}
      opts.on('-d', '--debug', 'enable EMR Job Flow debugging') { |config| options[:debug] = true }
      opts.on('-s', '--start YYYY-MM-DD', 'optional start date *') { |config| options[:start] = config }
      opts.on('-e', '--end YYYY-MM-DD', 'optional end date *') { |config| options[:end] = config }
-      opts.on('-x', '--skip staging,s3distcp,emr{enrich,shred,elasticsearch},archive_raw', Array, 'skip work step(s)') { |config| options[:skip] = config }
+      opts.on('-x', '--skip staging,s3distcp,emr{enrich,shred,elasticsearch,archive_raw}', Array, 'skip work step(s)') { |config| options[:skip] = config }
      opts.on('-E', '--process-enrich LOCATION', 'run enrichment only on specified location. Implies --skip staging,shred,archive_raw') { |config|
        options[:process_enrich_location] = config
        options[:skip] = %w(staging shred archive_raw)
      }
      opts.on('-S', '--process-shred LOCATION', 'run shredding only on specified location. Implies --skip staging,enrich,archive_raw') { |config|
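For context on how the --skip list reaches the runner: the Array type passed to opts.on makes OptionParser split the comma-separated value, producing the Ruby array that runner.rb later checks with include?. A self-contained illustration using only the standard library (the option name mirrors the real CLI; the sample argument list is made up):

require 'optparse'

options = {}
OptionParser.new do |opts|
  # Comma-separated skip list, coerced to an Array by OptionParser
  opts.on('-x', '--skip LIST', Array, 'skip work step(s)') { |list| options[:skip] = list }
end.parse!(%w(--skip staging,emr,archive_raw))

p options[:skip]                          # => ["staging", "emr", "archive_raw"]
p options[:skip].include?('archive_raw')  # => true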

@@ -19,10 +19,12 @@
  module EmrEtlRunner

    include Contracts

    CompressionFormat = lambda { |s| %w(NONE GZIP).include?(s) }
+    VolumeTypes = lambda { |s| %w(standard gp2 io1).include?(s) }
+    PositiveInt = lambda { |i| i.is_a?(Integer) && i > 0 }

    # The Hash containing assets for Hadoop.
    AssetsHash = ({
      :enrich => String,
      :shred => String
...
@@ -34,16 +36,10 @@
      :start => Maybe[String],
      :end => Maybe[String],
      :skip => Maybe[ArrayOf[String]],
      :process_enrich_location => Maybe[String],
      :process_shred_location => Maybe[String]
    })

-    # The Hash for the IP anonymization enrichment.
-    AnonIpHash = ({
-      :enabled => Bool,
-      :anon_octets => Num
-    })

    # The Hash containing the buckets field from the configuration YAML
    BucketHash = ({
      :assets => String,
...
@@ -82,10 +78,18 @@
      :es_nodes_wan_only => Maybe[Bool],
      :maxerror => Maybe[Num],
      :comprows => Maybe[Num]
    })

+    # The Hash containing the configuration for a core instance using EBS.
+    CoreInstanceEbsHash = ({
+      :volume_size => PositiveInt,
+      :volume_type => VolumeTypes,
+      :volume_iops => Maybe[PositiveInt],
+      :ebs_optimized => Maybe[Bool]
+    })

    # The Hash containing effectively the configuration YAML.
    ConfigHash = ({
      :aws => ({
        :access_key_id => String,
        :secret_access_key => String,
...
@@ -108,10 +112,11 @@
        }),
        :jobflow => ({
          :master_instance_type => String,
          :core_instance_count => Num,
          :core_instance_type => String,
+          :core_instance_ebs => Maybe[CoreInstanceEbsHash],
          :task_instance_count => Num,
          :task_instance_type => String,
          :task_instance_bid => Maybe[Num]
        }),
        :additional_info => Maybe[String],
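The new VolumeTypes and PositiveInt lambdas are plain Proc contracts: the contracts gem calls them with the candidate value and rejects the call if they return false. A standalone sketch of the same pattern applied to a cut-down EBS hash (this class and its values are illustrative, not Snowplow code):

require 'contracts'

class EbsConfigCheck
  include Contracts

  # Same Proc-contract style as contracts.rb above
  PositiveInt = lambda { |i| i.is_a?(Integer) && i > 0 }
  VolumeTypes = lambda { |s| %w(standard gp2 io1).include?(s) }

  EbsHash = ({
    :volume_size => PositiveInt,
    :volume_type => VolumeTypes
  })

  # Each key of the hash argument is validated against its contract
  Contract EbsHash => Bool
  def self.valid?(ebs)
    true
  end
end

EbsConfigCheck.valid?({ :volume_size => 100, :volume_type => "gp2" })  # => true
EbsConfigCheck.valid?({ :volume_size => 0, :volume_type => "gp3" })    # raises a contract error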

@@ -67,12 +67,12 @@
    @@failed_states = Set.new(%w(FAILED CANCELLED))

    include Monitoring::Logging

    # Initializes our wrapper for the Amazon EMR client.
-    Contract Bool, Bool, Bool, Bool, Bool, ConfigHash, ArrayOf[String], String => EmrJob
-    def initialize(debug, enrich, shred, elasticsearch, s3distcp, config, enrichments_array, resolver)
+    Contract Bool, Bool, Bool, Bool, Bool, Bool, ConfigHash, ArrayOf[String], String => EmrJob
+    def initialize(debug, enrich, shred, elasticsearch, s3distcp, archive_raw, config, enrichments_array, resolver)
      logger.debug "Initializing EMR jobflow"

      # Configuration
      custom_assets_bucket = self.class.get_hosted_assets_bucket(config[:aws][:s3][:buckets][:assets], config[:aws][:emr][:region])
...
@@ -90,19 +90,10 @@
      output_codec_argument = output_codec == 'none' ? [] : ["--outputCodec" , output_codec]

      s3 = Sluice::Storage::S3::new_fog_s3_from(
        config[:aws][:s3][:region],
        config[:aws][:access_key_id],
        config[:aws][:secret_access_key])

-      # Check whether there are an even number of .lzo and .lzo.index files
-      if config[:collectors][:format] == 'thrift'
-        processing_location = Sluice::Storage::S3::Location.new(config[:aws][:s3][:buckets][:raw][:processing])
-        processing_file_count = Sluice::Storage::S3.list_files(s3, processing_location).size
-        unless processing_file_count % 2 == 0
-          raise UnmatchedLzoFilesError, "Processing bucket contains #{processing_file_count} .lzo and .lzo.index files, expected an even number"
-        end
-      end

      # Configure Elasticity with your AWS credentials
      Elasticity.configure do |c|
        c.access_key = config[:aws][:access_key_id]
        c.secret_key = config[:aws][:secret_access_key]
...
@@ -139,10 +130,31 @@
      @jobflow.visible_to_all_users = true

      @jobflow.instance_count = config[:aws][:emr][:jobflow][:core_instance_count] + 1 # +1 for the master instance
      @jobflow.master_instance_type = config[:aws][:emr][:jobflow][:master_instance_type]
      @jobflow.slave_instance_type = config[:aws][:emr][:jobflow][:core_instance_type]

+      unless config[:aws][:emr][:jobflow][:core_instance_ebs].nil?
+        ebs_bdc = Elasticity::EbsBlockDeviceConfig.new
+        ebs_bdc.volume_type = config[:aws][:emr][:jobflow][:core_instance_ebs][:volume_type]
+        ebs_bdc.size_in_gb = config[:aws][:emr][:jobflow][:core_instance_ebs][:volume_size]
+        ebs_bdc.volumes_per_instance = 1
+        if config[:aws][:emr][:jobflow][:core_instance_ebs][:volume_type] == "io1"
+          ebs_bdc.iops = config[:aws][:emr][:jobflow][:core_instance_ebs][:volume_iops]
+        end
+        ebs_c = Elasticity::EbsConfiguration.new
+        ebs_c.add_ebs_block_device_config(ebs_bdc)
+        ebs_c.ebs_optimized = true
+        unless config[:aws][:emr][:jobflow][:core_instance_ebs][:ebs_optimized].nil?
+          ebs_c.ebs_optimized = config[:aws][:emr][:jobflow][:core_instance_ebs][:ebs_optimized]
+        end
+        @jobflow.set_core_ebs_configuration(ebs_c)
+      end

      if config[:collectors][:format] == 'thrift'
        if @legacy
          [
            Elasticity::HadoopBootstrapAction.new('-c', 'io.file.buffer.size=65536'),
...
@@ -391,10 +403,23 @@
        get_elasticsearch_steps(config, assets, enrich, shred).each do |step|
          @jobflow.add_step(step)
        end
      end

+      if archive_raw
+        # We need to copy our enriched events from HDFS back to S3
+        archive_raw_step = Elasticity::S3DistCpStep.new(legacy = @legacy)
+        archive_raw_step.arguments = [
+          "--src" , csbr[:processing],
+          "--dest" , self.class.partition_by_run(csbr[:archive], run_id),
+          "--s3Endpoint" , s3_endpoint,
+          "--deleteOnSuccess"
+        ]
+        archive_raw_step.name << ": Raw S3 Staging -> S3 Archive"
+        @jobflow.add_step(archive_raw_step)
+      end

      self
    end

    # Create one step for each Elasticsearch target for each source for that target
    #
...
@@ -566,10 +591,16 @@
          logger.warn "Got connection timeout #{to}, waiting 5 minutes before checking jobflow again"
          sleep(300)
        rescue RestClient::InternalServerError => ise
          logger.warn "Got internal server error #{ise}, waiting 5 minutes before checking jobflow again"
          sleep(300)
+        rescue Elasticity::ThrottlingException => te
+          logger.warn "Got Elasticity throttling exception #{te}, waiting 5 minutes before checking jobflow again"
+          sleep(300)
+        rescue ArgumentError => ae
+          logger.warn "Got Elasticity argument error #{ae}, waiting 5 minutes before checking jobflow again"
+          sleep(300)
        rescue IOError => ioe
          logger.warn "Got IOError #{ioe}, waiting 5 minutes before checking jobflow again"
          sleep(300)
        end
      end

@@ -39,11 +39,7 @@
    # Raised if there is no data to process
    # Not strictly an error, but used for control flow
    class NoDataToProcessError < Error
    end

-    # Raised if the .lzo and .lzo.index files aren't matched
-    class UnmatchedLzoFilesError < Error
-    end

  end
end

@@ -56,17 +56,18 @@
      unless @args[:skip].include?('emr')
        enrich = not(@args[:skip].include?('enrich'))
        shred = not(@args[:skip].include?('shred'))
        s3distcp = not(@args[:skip].include?('s3distcp'))
        elasticsearch = not(@args[:skip].include?('elasticsearch'))
+        archive_raw = not(@args[:skip].include?('archive_raw'))

        # Keep relaunching the job until it succeeds or fails for a reason other than a bootstrap failure
        tries_left = @config[:aws][:emr][:bootstrap_failure_tries]
        while true
          begin
            tries_left -= 1
-            job = EmrJob.new(@args[:debug], enrich, shred, elasticsearch, s3distcp, @config, @enrichments_array, @resolver)
+            job = EmrJob.new(@args[:debug], enrich, shred, elasticsearch, s3distcp, archive_raw, @config, @enrichments_array, @resolver)
            job.run(@config)
            break
          rescue BootstrapFailureError => bfe
            logger.warn "Job failed. #{tries_left} tries left..."
            if tries_left > 0
...
@@ -77,14 +78,10 @@
            else
              raise
            end
          end
        end
      end

-      unless @args[:skip].include?('archive_raw')
-        S3Tasks.archive_logs(@config)
-      end

      logger.info "Completed successfully"
      nil
    end

@@ -170,42 +170,8 @@
        true
      end
    end

-    # Moves (archives) the processed CloudFront logs to an archive bucket.
-    # Prevents the same log files from being processed again.
-    #
-    # Parameters:
-    # +config+:: the hash of configuration options
-    Contract ConfigHash => nil
-    def self.archive_logs(config)
-      Monitoring::Logging::logger.debug 'Archiving CloudFront logs...'
-
-      s3 = Sluice::Storage::S3::new_fog_s3_from(
-        config[:aws][:s3][:region],
-        config[:aws][:access_key_id],
-        config[:aws][:secret_access_key])
-
-      # Get S3 locations
-      processing_location = Sluice::Storage::S3::Location.new(config[:aws][:s3][:buckets][:raw][:processing]);
-      archive_location = Sluice::Storage::S3::Location.new(config[:aws][:s3][:buckets][:raw][:archive]);
-
-      # Attach date path if filenames include datestamp
-      add_date_path = lambda { |filepath|
-        if m = filepath.match('[^/]+\.(\d\d\d\d-\d\d-\d\d)-\d\d\.[^/]+\.gz$')
-          filename = m[0]
-          date = m[1]
-          return date + '/' + filename
-        else
-          return filepath
-        end
-      }
-
-      # Move all the files in the Processing Bucket
-      Sluice::Storage::S3::move_files(s3, processing_location, archive_location, '.+', add_date_path)
-
-      nil
-    end

  end
end

@@ -39,10 +39,15 @@
    # Adjust your Hadoop cluster below
    jobflow:
      master_instance_type: m1.medium
      core_instance_count: 2
      core_instance_type: m1.medium
+      core_instance_ebs:   # Optional. Attach an EBS volume to each core instance.
+        volume_size: 500   # Gigabytes
+        volume_type: "gp2"
+        volume_iops: 5000
+        ebs_optimized: false # Defaults to true
      task_instance_count: 0 # Increase to use spot instances
      task_instance_type: m1.medium
      task_instance_bid: 0.015 # In USD. Adjust bid, or leave blank for non-spot-priced (i.e. on-demand) task instances
      bootstrap_failure_tries: 3 # Number of times to attempt the job in the event of bootstrap failures
    collectors:

@@ -0,0 +1,33 @@
+-- Copyright (c) 2015 Snowplow Analytics Ltd. All rights reserved.
+--
+-- This program is licensed to you under the Apache License Version 2.0,
+-- and you may not use this file except in compliance with the Apache License Version 2.0.
+-- You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0.
+--
+-- Unless required by applicable law or agreed to in writing,
+-- software distributed under the Apache License Version 2.0 is distributed on an
+-- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+-- See the Apache License Version 2.0 for the specific language governing permissions and limitations there under.
+--
+-- Version: 0.8.0
+-- URL: -
+--
+-- Authors: Alex Dean
+-- Copyright: Copyright (c) 2015 Snowplow Analytics Ltd
+-- License: Apache License Version 2.0
+
+-- Create manifest table
+CREATE TABLE atomic.manifest (
+  -- Timestamp for this run of the pipeline
+  etl_tstamp timestamp,
+  -- Timestamp for when the load transaction was committed
+  commit_tstamp timestamp,
+  -- Count of events loaded
+  event_count bigint,
+  -- How many shredded tables were loaded in this run
+  shredded_cardinality int
+)
+DISTSTYLE ALL
+SORTKEY (etl_tstamp);
+
+COMMENT ON TABLE "atomic"."manifest" IS '0.1.0';

@@ -12,11 +12,11 @@
# Author:: Alex Dean (mailto:support@snowplowanalytics.com)
# Copyright:: Copyright (c) 2012-2014 Snowplow Analytics Ltd
# License:: Apache License Version 2.0

source "https://rubygems.org"

-ruby "2.2.3"
+ruby "2.3.1"

# StorageLoader is a Ruby app (not a RubyGem)
# built with Bundler, so we add in the
# RubyGems it requires here.
gem "sluice", "~> 0.4.0"

@@ -143,9 +143,9 @@
  sluice (~> 0.4.0)
  snowplow-tracker (~> 0.5.2)
  warbler

RUBY VERSION
-   ruby 2.2.3p0 (jruby 9.0.5.0)
+   ruby 2.3.1p0 (jruby 9.1.6.0)

BUNDLED WITH
-   1.12.5
+   1.13.7

@@ -13,10 +13,10 @@
# Author:: Fred Blundun (mailto:support@snowplowanalytics.com)
# Copyright:: Copyright (c) 2015 Snowplow Analytics Ltd
# License:: Apache License Version 2.0

-rvm install jruby
-bash -l -c 'rvm use jruby'
+rvm install jruby-9.1.6.0
+bash -l -c 'rvm use jruby-9.1.6.0'
gem install bundler
bundle install
rake

@@ -26,8 +26,8 @@
require_relative 'snowplow-storage-loader/sanitization'

module Snowplow
  module StorageLoader
    NAME = "snowplow-storage-loader"
-    VERSION = "0.8.0"
+    VERSION = "0.9.0"
  end
end

@@ -51,10 +51,13 @@
        config[:aws][:secret_access_key])

      # First let's get our statements for shredding (if any)
      shredded_statements = get_shredded_statements(config, target, s3)

+      # Now let's get the manifest statement
+      manifest_statement = get_manifest_statement(target[:table], shredded_statements.length)

      # Build our main transaction, consisting of COPY and COPY FROM JSON
      # statements, and potentially also a set of table ANALYZE statements.
      atomic_events_location = if OLD_ENRICHED_PATTERN.match(config[:enrich][:versions][:hadoop_shred])
        :enriched
...
@@ -73,11 +76,11 @@
          # Of the form "run=xxx/atomic-events"
          altered_enriched_subdirectory = ALTERED_ENRICHED_PATTERN.match(altered_enriched_filepath.key)[1]
          [build_copy_from_tsv_statement(config, config[:aws][:s3][:buckets][:shredded][:good] + altered_enriched_subdirectory, target[:table], target[:maxerror])]
        else
          [build_copy_from_tsv_statement(config, config[:aws][:s3][:buckets][:enriched][:good], target[:table], target[:maxerror])]
-        end + shredded_statements.map(&:copy)
+        end + shredded_statements.map(&:copy) + [manifest_statement]

      credentials = [config[:aws][:access_key_id], config[:aws][:secret_access_key]]

      status = PostgresLoader.execute_transaction(target, copy_statements)
      unless status == []
...
@@ -147,10 +150,32 @@
          )
        }
      end
    end

+    # Generates the SQL statement for updating the
+    # manifest table
+    #
+    # Parameters:
+    # +events_table+:: the name of the events table being loaded
+    # +shredded_cardinality+:: the number of shredded child events and contexts tables loaded in this run
+    Contract String, Num => String
+    def self.get_manifest_statement(events_table, shredded_cardinality)
+      s = extract_schema(events_table)
+      schema = if s.nil? then "" else "#{s}." end
+      "INSERT INTO #{schema}manifest
+         SELECT
+           etl_tstamp,
+           sysdate AS commit_tstamp,
+           count(*) AS event_count,
+           #{shredded_cardinality} AS shredded_cardinality
+         FROM #{events_table}
+         WHERE etl_tstamp IS NOT null
+         GROUP BY 1
+         ORDER BY etl_tstamp DESC
+         LIMIT 1;
+      "
+    end

    # Looks at the events table to determine if there's
    # a schema we should use for the shredded type tables.
    #
    # Parameters:
    # +events_table+:: the events table to load into

@@ -36,11 +36,11 @@
      s3 = Sluice::Storage::S3::new_fog_s3_from(
        config[:aws][:s3][:region],
        config[:aws][:access_key_id],
        config[:aws][:secret_access_key])
-      s3.host = region_to_safe_host([:aws][:s3][:region])
+      s3.host = region_to_safe_host(config[:aws][:s3][:region])

      # Get S3 location of In Bucket plus local directory
      in_location = Sluice::Storage::S3::Location.new(config[:aws][:s3][:buckets][:shredded][:good])
      download_dir = config[:storage][:download][:folder]
...
@@ -120,6 +120,6 @@
          end
        end
      end
    end
  end
end

@@ -1,5 +1,25 @@
+Release 87 Chichen Itza (2017-02-21)
+------------------------------------
+EmrEtlRunner: bump to 0.23.0 (#2960)
+EmrEtlRunner: bump JRuby version to 9.1.6.0 (#3050)
+EmrEtlRunner: bump Elasticity to 6.0.10 (#3013)
+EmrEtlRunner: remove AnonIpHash from contracts.rb (#2523)
+EmrEtlRunner: remove UnmatchedLzoFilesError check (#2740)
+EmrEtlRunner: use S3DistCp not Sluice for archive_raw step (#1977)
+EmrEtlRunner: add warning about the array of in buckets in config.yml (#2462)
+EmrEtlRunner: add dedicated return code of 4 for DirectoryNotEmptyError (#2546)
+EmrEtlRunner: add support for specifying EBS for Hadoop workers (#2950)
+EmrEtlRunner: add example EBS configuration to config.yml.sample (#3012)
+EmrEtlRunner: catch Elasticity ThrottlingExceptions while waiting for EMR (#3028)
+EmrEtlRunner: catch Elasticity ArgumentErrors while waiting for EMR (#3027)
+StorageLoader: bump to 0.9.0 (#2961)
+StorageLoader: bump JRuby version to 9.1.6.0 (#3051)
+StorageLoader: fix typo in S3Tasks.download_events (#2888)
+StorageLoader: update manifest table as part of Redshift load transaction (#2280)
+Redshift: added manifest table (#2265)
+
Release 86 Petra (2016-12-20)
-----------------------------
Common: add AWS credentials to .travis.yml (#2963)
Common: add CI/CD for Scala Hadoop Enrich (#2982)
Common: add CI/CD for Scala Hadoop Shred (#2928)

@@ -74,11 +74,11 @@
limitations under the License.

[travis-image]: https://travis-ci.org/snowplow/snowplow.png?branch=master
[travis]: http://travis-ci.org/snowplow/snowplow
-[release-image]: https://img.shields.io/badge/release-86_Petra-orange.svg?style=flat
+[release-image]: https://img.shields.io/badge/release-87_Chichen_Itza-orange.svg?style=flat
[releases]: https://github.com/snowplow/snowplow/releases
[license-image]: http://img.shields.io/badge/license-Apache--2-blue.svg?style=flat
[license]: http://www.apache.org/licenses/LICENSE-2.0

@@ -1 +1 @@
-r86-petra
+r87-chichen-itza