Data simplification : taming information with open source tools / (Record no. 247302)

000 -LEADER
fixed length control field 06432cam a2200601Ii 4500
001 - CONTROL NUMBER
control field ocn944961030
003 - CONTROL NUMBER IDENTIFIER
control field OCoLC
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20190328114814.0
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS
fixed length control field m o d
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION
fixed length control field cr cnu|||unuuu
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 160317t20162016mau ob 001 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency N$T
Language of cataloging eng
Description conventions rda
-- pn
Transcribing agency N$T
Modifying agency EBLCP
-- N$T
-- OPELS
-- OCLCF
-- YDXCP
-- CDX
-- UMI
-- AZK
-- TOH
-- STF
-- DEBBG
-- COO
-- DEBSZ
-- VGM
-- IUL
-- VT2
-- U3W
-- D6H
-- UOK
-- CEF
-- KSU
-- OCLCQ
-- AU@
-- OCLCQ
-- WYU
-- TKN
019 ## -
-- 961332310
-- 961514762
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9780128038543
Qualifying information (electronic bk.)
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 0128038543
Qualifying information (electronic bk.)
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 0128037814
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9780128037812
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
Canceled/invalid ISBN 9780128037812
035 ## - SYSTEM CONTROL NUMBER
System control number (OCoLC)944961030
Canceled/invalid control number (OCoLC)961332310
-- (OCoLC)961514762
050 #4 - LIBRARY OF CONGRESS CALL NUMBER
Classification number QA76.76.S46
072 #7 - SUBJECT CATEGORY CODE
Subject category code COM
Subject category code subdivision 051390
Source bisacsh
072 #7 - SUBJECT CATEGORY CODE
Subject category code COM
Subject category code subdivision 051440
Source bisacsh
072 #7 - SUBJECT CATEGORY CODE
Subject category code COM
Subject category code subdivision 051230
Source bisacsh
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 005.3
Edition number 23
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Berman, Jules J.,
Relator term author.
245 10 - TITLE STATEMENT
Title Data simplification : taming information with open source tools /
Medium [electronic resource]
Statement of responsibility, etc. Jules J. Berman.
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Place of production, publication, distribution, manufacture Cambridge, MA :
Name of producer, publisher, distributor, manufacturer Morgan Kaufmann is an imprint of Elsevier,
Date of production, publication, distribution, manufacture, or copyright notice 2016.
264 #4 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Date of production, publication, distribution, manufacture, or copyright notice ©2016
300 ## - PHYSICAL DESCRIPTION
Extent 1 online resource
336 ## - CONTENT TYPE
Content type term text
Content type code txt
Source rdacontent
337 ## - MEDIA TYPE
Media type term computer
Media type code c
Source rdamedia
338 ## - CARRIER TYPE
Carrier type term online resource
Carrier type code cr
Source rdacarrier
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc Includes bibliographical references and index.
588 0# - SOURCE OF DESCRIPTION NOTE
Source of description note Online resource; title from PDF title page (EBSCO, viewed March 21, 2016).
520 ## - SUMMARY, ETC.
Summary, etc. Data Simplification: Taming Information With Open Source Tools addresses the simple fact that modern data is too big and complex to analyze in its native form. Data simplification is the process whereby large and complex data is rendered usable. Complex data must be simplified before it can be analyzed, but the process of data simplification is anything but simple, requiring a specialized set of skills and tools. This book provides data scientists from every scientific discipline with the methods and tools to simplify their data for immediate analysis or long-term storage in a form that can be readily repurposed or integrated with other data. Drawing upon years of practical experience, and using numerous examples and use cases, Jules Berman discusses the principles, methods, and tools that must be studied and mastered to achieve data simplification; open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data; natural language processing and machine translation as tools to simplify data; and data summarization and visualization and the role they play in making data useful for the end user.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Front cover; Data Simplification: Taming Information With Open Source Tools; Copyright; Dedication; Contents; Foreword; Preface; Organization of this book; Chapter Organization; How to Read this Book; Nota Bene; Glossary; References; Author Biography; Chapter 1: The Simple Life; 1.1. Simplification Drives Scientific Progress; 1.2. The Human Mind is a Simplifying Machine; 1.3. Simplification in Nature; 1.4. The Complexity Barrier; 1.5. Getting Ready; Open Source Tools; Perl; Python; Ruby; Text Editors; OpenOffice; LibreOffice; Command Line Utilities; Cygwin, Linux Emulation for Windows.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note DOS Batch Scripts; Linux Bash Scripts; Interactive Line Interpreters; Package Installers; System Calls; Glossary; References; Chapter 2: Structuring Text; 2.1. The Meaninglessness of Free Text; 2.2. Sorting Text, the Impossible Dream; 2.3. Sentence Parsing; 2.4. Abbreviations; 2.5. Annotation and the Simple Science of Metadata; 2.6. Specifications Good, Standards Bad; Open Source Tools; ASCII; Regular Expressions; Format Commands; Converting Nonprintable Files to Plain-Text; Dublin Core; Glossary; References; Chapter 3: Indexing Text; 3.1. How Data Scientists Use Indexes.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 3.2. Concordances and Indexed Lists; 3.3. Term Extraction and Simple Indexes; 3.4. Autoencoding and Indexing with Nomenclatures; 3.5. Computational Operations on Indexes; Open Source Tools; Word Lists; Doublet Lists; Ngram Lists; Glossary; References; Chapter 4: Understanding Your Data; 4.1. Ranges and Outliers; 4.2. Simple Statistical Descriptors; 4.3. Retrieving Image Information; 4.4. Data Profiling; 4.5. Reducing Data; Open Source Tools; Gnuplot; MatPlotLib; R, for Statistical Programming; Numpy; Scipy; ImageMagick; Displaying Equations in LaTex; Normalized Compression Distance.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note Pearson's Correlation; The Ridiculously Simple Dot Product; Glossary; References; Chapter 5: Identifying and Deidentifying Data; 5.1. Unique Identifiers; 5.2. Poor Identifiers, Horrific Consequences; 5.3. Deidentifiers and Reidentifiers; 5.4. Data Scrubbing; 5.5. Data Encryption and Authentication; 5.6. Timestamps, Signatures, and Event Identifiers; Open Source Tools; Pseudorandom Number Generators; UUID; Encryption and Decryption with OpenSSL; One-Way Hash Implementations; Steganography; Glossary; References; Chapter 6: Giving Meaning to Data; 6.1. Meaning and Triples.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 6.2. Driving Down Complexity With Classifications; 6.3. Driving Up Complexity With Ontologies; 6.4. The Unreasonable Effectiveness of Classifications; 6.5. Properties That Cross Multiple Classes; Open Source Tools; Syntax for Triples; RDF Schema; RDF Parsers; Visualizing Class Relationships; Glossary; References; Chapter 7: Object-oriented Data; 7.1. The Importance of Self-Explaining Data; 7.2. Introspection and Reflection; 7.3. Object-Oriented Data Objects; 7.4. Working With Object-Oriented Data; Open Source Tools; Persistent Data; SQLite Databases; Glossary; References.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Open source software.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Data mining.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Database management.
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element COMPUTERS
General subdivision Programming
-- Open Source.
Source of heading or term bisacsh
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element COMPUTERS
General subdivision Software Development & Engineering
-- Tools.
Source of heading or term bisacsh
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element COMPUTERS
General subdivision Software Development & Engineering
-- General.
Source of heading or term bisacsh
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Data mining.
Source of heading or term fast
Authority record control number (OCoLC)fst00887946
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Database management.
Source of heading or term fast
Authority record control number (OCoLC)fst00888037
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Open source software.
Source of heading or term fast
Authority record control number (OCoLC)fst01046097
655 #4 - INDEX TERM--GENRE/FORM
Genre/form data or focus term Electronic books.
776 08 - ADDITIONAL PHYSICAL FORM ENTRY
Relationship information Print version:
Main entry heading Berman, Jules J.
Title Data simplification : taming information with open source tools.
Place, publisher, and date of publication Cambridge, MA : Elsevier, [2016]
International Standard Book Number 9780128037812
Record control number (DLC) 18934818
856 40 - ELECTRONIC LOCATION AND ACCESS
Materials specified ScienceDirect
Uniform Resource Identifier http://www.sciencedirect.com/science/book/9780128037812

No items available.

Last Updated on September 15, 2019
© Dhaka University Library. All Rights Reserved