Shuhai Education Group
        National toll-free enrollment hotline: 4008699035 WeChat: shuhaipeixun
        or 15921673576 (same number on WeChat) QQ: 1299983702
         
        Big Data Business Intelligence for Criminal Intelligence Analysis Training

         
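           Class size and environment
               Each session is limited to 3 to 5 students.
           Class schedule and location
        Locations: [Shanghai] Tongji University (Huxi Campus) / Xincheng Jinjun Business Building (Baiyin Road Station, Metro Line 11); [Shenzhen branch] Film Building (Grand Theater Station, Metro Line 1); [Wuhan branch] Jiayuan Building; [Chengdu branch] Lingguan District No. 1; [Shenyang branch] Shenyang Ligong University; [Zhengzhou branch] Jinhua Building; [Shijiazhuang branch] Ruijing Building; [Beijing branch] Beijing Zhongshan College; [Nanjing branch] Jingang Building
        Upcoming classes (consecutive-day, weekend and evening classes): April 7, 2025 (starting soon; inquiries welcome)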
           Lab equipment
             ☆ Taught by senior engineers

                ☆ Quality-focused ☆ Practice alongside lectures

                ☆ Free job referrals for students who pass
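           Quality guarantee

                1. Students may retake the course in later sessions free of charge.
                2. Free after-class technical support is provided to safeguard training results.
                3. Students who pass the training are eligible for free job referrals.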

        Course outline
         

        Day 01
        =====
        Overview of Big Data Business Intelligence for Criminal Intelligence Analysis

        Case Studies from Law Enforcement - Predictive Policing
        Big Data adoption rate in Law Enforcement Agencies and how they are aligning their future operation around Big Data Predictive Analytics
        Emerging technology solutions such as gunshot sensors, surveillance video and social media
        Using Big Data technology to mitigate information overload
        Interfacing Big Data with Legacy data
        Basic understanding of enabling technologies in predictive analytics
        Data Integration & Dashboard visualization
        Fraud management
        Business Rules and Fraud detection
        Threat detection and profiling
        Cost benefit analysis for Big Data implementation
        Introduction to Big Data

        Main characteristics of Big Data -- Volume, Variety, Velocity and Veracity.
        MPP (Massively Parallel Processing) architecture
        Data Warehouses – static schema, slowly evolving dataset
        MPP Databases: Greenplum, Exadata, Teradata, Netezza, Vertica etc.
        Hadoop Based Solutions – no conditions on structure of dataset.
        Typical pattern : HDFS, MapReduce (crunch), retrieve from HDFS
        Apache Spark for stream processing
        Batch - suited for analytical/non-interactive workloads
        Velocity: CEP (Complex Event Processing) streaming data
        Typical choices – CEP products (e.g. InfoSphere Streams, Apama, MarkLogic etc.)
        Less production ready – Storm/S4
        NoSQL Databases – (columnar and key-value): Best suited as analytical adjunct to data warehouse/database
        NoSQL solutions

        KV Store - Keyspace, Flare, SchemaFree, RAMCloud, Oracle NoSQL Database (OnDB)
        KV Store - Dynamo, Voldemort, Dynomite, SubRecord, Mo8onDb, DovetailDB
        KV Store (Hierarchical) - GT.M, Caché
        KV Store (Ordered) - TokyoTyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord
        KV Cache - Memcached, Repcached, Coherence, Infinispan, eXtreme Scale, JBossCache, Velocity, Terracotta
        Tuple Store - Gigaspaces, Coord, Apache River
        Object Database - ZopeDB, db4o, Shoal
        Document Store - CouchDB, Cloudant, Couchbase, MongoDB, Jackrabbit, XML-Databases, ThruDB, CloudKit, Persevere, Riak-Basho, Scalaris
        Wide Columnar Store - BigTable, HBase, Apache Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
        Varieties of Data: Introduction to Data Cleaning issues in Big Data

        RDBMS – static structure/schema, does not promote agile, exploratory environment.
        NoSQL – semi structured, enough structure to store data without exact schema before storing data
        Data cleaning issues
        Hadoop

        When to select Hadoop?
        STRUCTURED - Enterprise data warehouses/databases can store massive data (at a cost) but impose structure (not good for active exploration)
        SEMI STRUCTURED data – difficult to carry out using traditional solutions (DW/DB)
        Warehousing data = HUGE effort and static even after implementation
        For variety & volume of data, crunched on commodity hardware – HADOOP
        Commodity H/W needed to create a Hadoop Cluster
        Introduction to Map Reduce /HDFS

        MapReduce – distribute computing over multiple servers
        HDFS – make data available locally for the computing process (with redundancy)
        Data – can be unstructured/schema-less (unlike RDBMS)
        Developer responsibility to make sense of data
        Programming MapReduce = working with Java (pros/cons), manually loading data into HDFS
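
The MapReduce model listed above can be illustrated without a cluster. The following is a minimal single-process sketch in Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample records are hypothetical and chosen only for illustration. On a real cluster the framework distributes the map and reduce calls across servers and reads the records from HDFS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values emitted for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical incident-log lines used only as sample input.
records = [
    "burglary reported downtown",
    "vehicle theft reported",
    "burglary suspect seen downtown",
]
counts = run_job(records)
```

Each map call depends on one record only, and each reduce call on one key only, which is what allows the work to be spread over many servers.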
        =====
        Day 02
        =====
        Big Data Ecosystem -- Building Big Data ETL (Extract, Transform, Load) -- Which Big Data Tools to use and when?

        Hadoop vs. Other NoSQL solutions
        For interactive, random access to data
        HBase (column oriented database) on top of Hadoop
        Random access to data but restrictions imposed (max 1 PB)
        Not good for ad-hoc analytics, good for logging, counting, time-series
        Sqoop - Import from databases to Hive or HDFS (JDBC/ODBC access)
        Flume – Stream data (e.g. log data) into HDFS
        Big Data Management System

        Moving parts, compute nodes start/fail :ZooKeeper - For configuration/coordination/naming services
        Complex pipeline/workflow: Oozie – manage workflow, dependencies, daisy chain
        Deploy, configure, cluster management, upgrade etc (sys admin) :Ambari
        In Cloud : Whirr
        Predictive Analytics -- Fundamental Techniques and Machine Learning based Business Intelligence

        Introduction to Machine Learning
        Learning classification techniques
        Bayesian Prediction -- preparing a training file
        Support Vector Machine
        KNN p-Tree Algebra & vertical mining
        Neural Networks
        Big Data large variable problem -- Random forest (RF)
        Big Data Automation problem – Multi-model ensemble RF
        Automation through Soft10-M
        Text analytic tool-Treeminer
        Agile learning
        Agent based learning
        Distributed learning
        Introduction to open source tools for predictive analytics: R, Python, RapidMiner, Mahout
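
As an illustration of the classification techniques listed above, here is a minimal sketch of a plain k-nearest-neighbours (KNN) classifier in Python. It is the textbook algorithm, not the p-Tree variant named in the outline, and the feature values and the labels "low" and "high" are hypothetical sample data.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (features, label) pairs; features are numeric tuples.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: two numeric features per case, one label each.
train = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((0.9, 1.1), "low"),
    ((5.0, 5.2), "high"),
    ((5.1, 4.9), "high"),
    ((4.8, 5.0), "high"),
]

prediction_a = knn_predict(train, (1.1, 1.0))
prediction_b = knn_predict(train, (5.0, 5.0))
```

The sketch sorts the whole training set for every query, which is acceptable for a handful of points but is the kind of cost that motivates the distributed approaches covered later in the course.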
        Predictive Analytics Ecosystem and its application in Criminal Intelligence Analysis

        Technology and the investigative process
        Insight analytics
        Visualization analytics
        Structured predictive analytics
        Unstructured predictive analytics
        Threat/fraudster/vendor profiling
        Recommendation Engine
        Pattern detection
        Rule/Scenario discovery – failure, fraud, optimization
        Root cause discovery
        Sentiment analysis
        CRM analytics
        Network analytics
        Text analytics for obtaining insights from transcripts, witness statements, internet chatter, etc.
        Technology assisted review
        Fraud analytics
        Real-time analytics
        =====
        Day 03
        =====
        Real Time and Scalable Analytics Over Hadoop

        Why common analytic algorithms fail in Hadoop/HDFS
        Apache Hama - for Bulk Synchronous Parallel (BSP) distributed computing
        Apache Spark - for cluster computing and real-time analytics
        CMU GraphLab 2 - graph-based asynchronous approach to distributed computing
        KNN p-Tree - algebra-based approach from Treeminer for reduced hardware cost of operation
        Tools for eDiscovery and Forensics

        eDiscovery over Big Data vs. Legacy data – a comparison of cost and performance
        Predictive coding and Technology Assisted Review (TAR)
        Live demo of vMiner for understanding how TAR enables faster discovery
        Faster indexing through HDFS – Velocity of data
        NLP (Natural Language processing) – open source products and techniques
        eDiscovery in foreign languages -- technology for foreign language processing
        Big Data BI for Cyber Security – Getting a 360-degree view, speedy data collection and threat identification

        Understanding the basics of security analytics -- attack surface, security misconfiguration, host defenses
        Network infrastructure / Large datapipe / Response ETL for real-time analytics
        Prescriptive vs predictive – Fixed rule based vs auto-discovery of threat rules from Meta data
        Gathering disparate data for Criminal Intelligence Analysis

        Using IoT (Internet of Things) as sensors for capturing data
        Using Satellite Imagery for Domestic Surveillance
        Using surveillance and image data for criminal identification
        Other data gathering technologies -- drones, body cameras, GPS tagging systems and thermal imaging technology
        Combining automated data retrieval with data obtained from informants, interrogation, and research
        Forecasting criminal activity
        =====
        Day 04
        =====
        Fraud prevention BI from Big Data in Fraud Analytics

        Basic classification of Fraud Analytics -- rules-based vs predictive analytics
        Supervised vs unsupervised Machine learning for Fraud pattern detection
        Business to business fraud, medical claims fraud, insurance fraud, tax evasion and money laundering
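
The contrast between rules-based and predictive (here, unsupervised) fraud detection can be sketched in a few lines of Python. This is an illustrative example only: the transaction amounts, the 10,000 limit and the z-score threshold of 2.0 are all hypothetical values, and a z-score test stands in for the more capable models a real system would use.

```python
from statistics import mean, stdev


def rule_based_flags(amounts, limit=10_000):
    """Fixed business rule: flag any transaction above a preset limit."""
    return [amount > limit for amount in amounts]


def zscore_flags(amounts, threshold=2.0):
    """Unsupervised check: flag amounts far from the mean of the data itself."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        return [False] * len(amounts)
    return [abs(amount - mu) / sigma > threshold for amount in amounts]


# Hypothetical transaction amounts; the last one is unusual for this account
# but still below the fixed limit.
amounts = [120, 95, 130, 110, 105, 98, 125, 115, 102, 4_500]

by_rule = rule_based_flags(amounts)
by_zscore = zscore_flags(amounts)
```

With these sample values the fixed rule flags nothing, while the z-score check flags only the final transaction. The rule encodes what an analyst already knows to look for, whereas the statistical check derives "unusual" from the data.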
        Social Media Analytics -- Intelligence gathering and analysis

        How Social Media is used by criminals to organize, recruit and plan
        Big Data ETL API for extracting social media data
        Text, image, meta data and video
        Sentiment analysis from social media feed
        Contextual and non-contextual filtering of social media feed
        Social Media Dashboard to integrate diverse social media
        Automated profiling of social media profile
        Live demo of each analytic will be given through Treeminer Tool
        Big Data Analytics in image processing and video feeds

        Image Storage techniques in Big Data -- Storage solution for data exceeding petabytes
        LTFS (Linear Tape File System) and LTO (Linear Tape Open)
        GPFS-LTFS (General Parallel File System - Linear Tape File System) -- layered storage solution for Big image data
        Fundamentals of image analytics
        Object recognition
        Image segmentation
        Motion tracking
        3-D image reconstruction
        Biometrics, DNA and Next Generation Identification Programs

        Beyond fingerprinting and facial recognition
        Speech recognition, keystroke dynamics (analyzing a user's typing pattern) and CODIS (Combined DNA Index System)
        Beyond DNA matching: using forensic DNA phenotyping to construct a face from DNA samples
        Big Data Dashboard for quick accessibility of diverse data and display :

        Integration of existing application platform with Big Data Dashboard
        Big Data management
        Case Study of Big Data Dashboard: Tableau and Pentaho
        Using Big Data apps to push location-based services in government
        Tracking system and management
        =====
        Day 05
        =====
        How to justify Big Data BI implementation within an organization:

        Defining the ROI (Return on Investment) for implementing Big Data
        Case studies for saving Analyst Time in collection and preparation of Data – increasing productivity
        Revenue gain from lower database licensing cost
        Revenue gain from location based services
        Cost savings from fraud prevention
        An integrated spreadsheet approach for calculating approximate expenses vs. Revenue gain/savings from Big Data implementation.
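
The spreadsheet-style comparison of expenses against gains reduces to the standard ROI formula, ROI = (total gain - total cost) / total cost. The sketch below shows the arithmetic in Python. The gain categories follow the list above, but the cost categories and every figure are hypothetical placeholders, not data from any real deployment.

```python
def roi(total_gain, total_cost):
    """Return on investment as a fraction: (gain - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_gain - total_cost) / total_cost


# Hypothetical one-year figures in a single currency unit.
costs = {
    "hardware": 200_000,
    "software_and_support": 100_000,
    "staff_training": 50_000,
    "data_migration": 50_000,
}
gains = {
    "analyst_time_saved": 150_000,
    "lower_database_licensing": 200_000,
    "location_based_services": 50_000,
    "fraud_prevention_savings": 200_000,
}

total_cost = sum(costs.values())
total_gain = sum(gains.values())
result = roi(total_gain, total_cost)
```

With these placeholder numbers the total cost is 400,000 and the total gain is 600,000, giving an ROI of 0.5, or 50%.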
        Step by Step procedure for replacing a legacy data system with a Big Data System

        Big Data Migration Roadmap
        What critical information is needed before architecting a Big Data system?
        What are the different ways of calculating the Volume, Velocity, Variety and Veracity of data?
        How to estimate data growth
        Case studies
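
One simple way to estimate data growth is to assume a constant annual growth rate and compound it. The Python sketch below makes that assumption explicit. The function names, the starting volume, the growth rate and the capacity figure are all hypothetical, and real growth is rarely this regular.

```python
def projected_volume_tb(current_tb, annual_growth_rate, years):
    """Compound growth: volume after the given number of whole years."""
    return current_tb * (1 + annual_growth_rate) ** years


def years_until_capacity(current_tb, annual_growth_rate, capacity_tb):
    """Count whole years until the projected volume exceeds capacity_tb."""
    if annual_growth_rate <= 0:
        raise ValueError("annual_growth_rate must be positive")
    years = 0
    volume = current_tb
    while volume <= capacity_tb:
        volume *= 1 + annual_growth_rate
        years += 1
    return years


# Hypothetical inputs: 100 TB today, growing 50% per year, 500 TB of capacity.
volume_in_two_years = projected_volume_tb(100, 0.5, 2)
years_left = years_until_capacity(100, 0.5, 500)
```

With these inputs the projection is 225 TB after two years, and the 500 TB capacity is exceeded in the fourth year.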
        Review of Big Data Vendors and review of their products.

        Accenture
        Aptean (formerly CDC Software)
        Cisco Systems
        Cloudera
        Dell
        EMC
        GoodData Corporation
        Guavus
        Hitachi Data Systems
        Hortonworks
        HP
        IBM
        Informatica
        Intel
        Jaspersoft
        Microsoft
        MongoDB (formerly 10gen)
        Mu Sigma
        NetApp
        Opera Solutions
        Oracle
        Pentaho
        Platfora
        QlikTech
        Quantum
        Rackspace
        Revolution Analytics
        Salesforce
        SAP
        SAS Institute
        Sisense
        Software AG/Terracotta
        Soft10 Automation
        Splunk
        Sqrrl
        Supermicro
        Tableau Software
        Teradata
        Think Big Analytics
        Tidemark Systems
        Treeminer
        VMware (Part of EMC)
        Q/A session

         
          ICP registration no.: 滬ICP備08026168號-1 (July 24, 2024)