Janssens Jeroen, Nieuwdorp Thijs - Python Polars: The Definitive Guide [2025, PDF/EPUB, ENG]
Python Polars: The Definitive Guide
Year of publication: 2025
Authors: Jeroen Janssens, Thijs Nieuwdorp
Publisher: O’Reilly Media, Inc.
ISBN: 978-1-098-15608-4
Language: English
Format: PDF/EPUB
Quality: Publisher's layout or text (eBook)
Interactive table of contents: Yes
Number of pages: 504

Description:
Unlock the power of Polars, a Python package for transforming, analyzing, and visualizing data. In this hands-on guide, Jeroen Janssens and Thijs Nieuwdorp walk you through every feature of Polars, showing you how to use it for real-world tasks like data wrangling, exploratory data analysis, building pipelines, and more. Whether you're a seasoned data professional or new to data science, you'll quickly master Polars' expressive API and its underlying concepts. You don't need to have experience with pandas, but if you do, this book will help you make a seamless transition. The many practical examples and real-world datasets are available on GitHub, so you can easily follow along.

- Process data from CSV, Parquet, spreadsheets, databases, and the cloud
- Get a solid understanding of Expressions, the building blocks of every query
- Handle complex data types, including text, time, and nested structures
- Use both eager and lazy APIs, and know when to use each
- Visualize your data with Altair, hvPlot, plotnine, and Great Tables
- Extend Polars with your own Python functions and Rust plugins
- Leverage GPU acceleration to boost performance even further

Table of contents:

Foreword
Preface

Part I. Begin

1. Introducing Polars
What Is This Thing Called Polars?; Key Features; Key Concepts; Advantages; Why You Should Use Polars; Performance; Usability; Popularity; Sustainability; Polars Compared to Other Data Processing Packages; Why We Focus on Python Polars; How This Book Is Organized; An ETL Showcase; Extract; Bonus: Visualizing Neighborhoods and Stations; Transform; Bonus: Visualizing Daily Trips per Borough; Load; Bonus: Becoming Faster by Being Lazy; Takeaways

2. Getting Started
Setting Up Your Environment; Downloading the Project; Installing uv; Installing the Project; Working with the Virtual Environment; Verifying Your Installation; Crash Course in JupyterLab; Keyboard Shortcuts; Installing Polars on Other Projects; All Optional Dependencies; Optional Dependencies for Interoperability; Optional Dependencies for Working with Spreadsheets; Optional Dependencies for Working with Databases; Optional Dependencies for Working with Remote Filesystems; Optional Dependencies for Other I/O Formats; Optional Dependencies for Extra Functionality; Installing Optional Dependencies; Configuring Polars; Temporary Configuration Using a Context Manager; Local Configuration Using a Decorator; Compiling Polars from Scratch; Edge Case: Very Large Datasets; Edge Case: Processors Lacking AVX Support; Takeaways

3. Moving from pandas to Polars
Animals; Similarities to Recognize; Appearances to Appreciate; Differences in Code; Differences in Display; Concepts to Unlearn; Index; Axes; Indexing and Slicing; Eagerness; Relaxedness; Syntax to Forget; Common Operations Side By Side; To and From pandas; Takeaways

Part II. Form

4. Data Structures and Data Types
Series, DataFrames, and LazyFrames; Data Types; Nested Data Types; Missing Values; Data Type Conversion; Takeaways

5. Eager and Lazy APIs
Eager API: DataFrame; Lazy API: LazyFrame; Performance Differences; Functionality Differences; Attributes; Aggregation Methods; Computation Methods; Descriptive Methods; GroupBy Methods; Exporting Methods; Manipulation and Selection Methods; Miscellaneous Methods; Tips and Tricks; Going from LazyFrame to DataFrame and Vice Versa; Joining a DataFrame with a LazyFrame; Caching Intermittent Results; Takeaways

6. Reading and Writing Data
Format Overview; Reading CSV Files; Parsing Missing Values Correctly; Reading Files with Encodings Other Than UTF-8; Reading Excel Spreadsheets; Working with Multiple Files; Reading Parquet; Reading JSON and NDJSON; JSON; NDJSON; Other File Formats; Querying Databases; Writing Data; CSV Format; Excel Format; Parquet Format; Other Considerations; Takeaways

Part III. Express

7. Beginning Expressions
Methods and Namespaces; Expressions by Example; Selecting Columns with Expressions; Creating New Columns with Expressions; Filtering Rows with Expressions; Aggregating with Expressions; Sorting Rows with Expressions; The Definition of an Expression; Properties of Expressions; Creating Expressions; From Existing Columns; From Literal Values; From Ranges; Other Functions to Create Expressions; Renaming Expressions; Expressions Are Idiomatic; Takeaways

8. Continuing Expressions
Types of Operations; Example A: Element-Wise Operations; Example B: Operations That Summarize to One; Example C: Operations That Summarize to One or More; Example D: Operations That Extend; Element-Wise Operations; Operations That Perform Mathematical Transformations; Operations Related to Trigonometry; Operations That Round and Categorize; Operations for Missing or Infinite Values; Other Operations; Nonreducing Series-Wise Operations; Operations That Accumulate; Operations That Fill and Shift; Operations Related to Duplicate Values; Operations That Compute Rolling Statistics; Operations That Sort; Other Operations; Series-Wise Operations That Summarize to One; Operations That Are Quantifiers; Operations That Compute Statistics; Operations That Count; Other Operations; Series-Wise Operations That Summarize to One or More; Operations Related to Unique Values; Operations That Select; Operations That Drop Missing Values; Other Operations; Series-Wise Operations That Extend; Takeaways

9. Combining Expressions
Inline Operators Versus Methods; Arithmetic Operations; Comparison Operations; Boolean Algebra Operations; Bitwise Operations; Using Functions; When, Then, Otherwise; Takeaways

Part IV. Transform

10. Selecting and Creating Columns
Selecting Columns; Introducing Selectors; Selecting Based on Name; Selecting Based on Data Type; Selecting Based on Position; Combining Selectors; Creating Columns; Related Column Operations; Dropping; Renaming; Stacking; Adding Row Indices; Takeaways

11. Filtering and Sorting Rows
Filtering Rows; Filtering Based on Expressions; Filtering Based on Column Names; Filtering Based on Constraints; Sorting Rows; Sorting Based on a Single Column; Sorting in Reverse; Sorting Based on Multiple Columns; Sorting Based on Expressions; Sorting Nested Data Types; Related Row Operations; Filtering Missing Values; Slicing; Top and Bottom; Sampling; Semi-Joins; Takeaways

12. Working with Textual, Temporal, and Nested Data Types
String; String Methods; String Examples; Categorical; Categorical Methods; Categorical Examples; Enum; Temporal; Temporal Methods; Temporal Examples; List; List Methods; List Examples; Array; Array Methods; Array Examples; Struct; Struct Methods; Struct Examples; Takeaways

13. Summarizing and Aggregating
Split, Apply, and Combine; GroupBy Context; The Descriptives; Advanced Methods; Row-Wise Aggregations; Window Functions in Selection Context; Dynamic Grouping; Rolling Aggregations; Upsampling; Takeaways

14. Joining and Concatenating
Joining; Join Strategies; Joining on Multiple Columns; Validation; Inexact Joining; Inexact Join Strategies; Additional Fine-Tuning; Use Case: Marketing Campaign Attribution; Vertical and Horizontal Concatenation; Vertical; Horizontal; Diagonal; Align; Relaxed; Stacking; Appending; Extending; Takeaways

15. Reshaping
Wide Versus Long DataFrames; Pivot to a Wider DataFrame; Unpivot to a Longer DataFrame; Transposing; Exploding; Partition into Multiple DataFrames; Takeaways

Part V. Advance

16. Visualizing Data
NYC Bike Trips; Built-In Plotting with Altair; Introducing Altair; Methods in the Plot Namespaces; Plotting DataFrames; Too Large to Handle; Plotting Series; pandas-Like Plotting with hvPlot; Introducing hvPlot; A First Plot; Methods in the hvPlot Namespace; pandas as Backup; Manual Transformations; Changing the Plotting Backend; Plotting Points on a Map; Composing Plots; Adding Interactive Widgets; Publication-Quality Graphics with plotnine; Introducing plotnine; Plots for Exploration; Plots for Communication; Styling DataFrames With Great Tables; Takeaways

17. Extending Polars
User-Defined Functions in Python; Applying a Function to Elements; Applying a Function to a Series; Applying a Function to Groups; Applying a Function to an Expression; Applying a Function to a DataFrame or LazyFrame; Registering Your Own Namespace; Polars Plugins in Rust; Prerequisites; The Anatomy of a Plugin Project; The Plugin; Compiling the Plugin; Performance Benchmark; Register Arguments; Using a Rust Crate; Use Case: geo; Takeaways

18. Polars Internals
Polars’ Architecture; Arrow; Multithreaded Computations and SIMD Operations; The String Data Type in Memory; ChunkedArrays in Series; Query Optimization; LazyFrame Scan-Level Optimizations; Other Optimizations; Checking Your Expressions; meta Namespace Overview; meta Namespace Examples; Profiling Polars; Tests in Polars; Comparing DataFrames and Series; Common Antipatterns; Using Brackets for Column Selection; Misusing Collect; Using Python Code in your Polars Queries; Takeaways

Appendix: Accelerating Polars with the GPU
Index