• R/O
  • SSH

execsql: コミット

Default repository for execsql.py


コミットメタ情報

リビジョン907a882a546cb0dfd284517e2fb7d5a07b82501d (tree)
日時2020-04-30 22:31:01
作者Dreas Nielsen <dreas.nielsen@gmai...>
コミッターDreas Nielsen

ログメッセージ

Added section on performance to usage documentation.

変更サマリ

差分

diff -r 05a4dfe548b8 -r 907a882a546c doc/source/metacommands.rst
--- a/doc/source/metacommands.rst Sat Apr 04 06:29:26 2020 -0700
+++ b/doc/source/metacommands.rst Thu Apr 30 06:31:01 2020 -0700
@@ -2582,7 +2582,7 @@
25822582
25832583 Evaluates whether there is a database view of the given name. For
25842584 Access, this tests for the existence of a query of the given name.
2585-execsql queries the information schema tables, or analogous tables, for
2585+*execsql* queries the information schema tables, or analogous tables, for
25862586 this information. You must have permission to use these system tables.
25872587 If you do not, the alternative approach described for the
25882588 :ref:`TABLE_EXISTS <tableexists>` conditional can be used.
@@ -2629,7 +2629,7 @@
26292629 identify the quote and delimiter characters regardless of the setting
26302630 of the "-s" option.
26312631
2632-execsql will read CSV files containing newlines embeded in delimited
2632+*execsql* will read CSV files containing newlines embeded in delimited
26332633 text values. Scanning of a CSV file to determine the quote and
26342634 delimiter characters may produce incorrect results if most of the
26352635 physical lines scanned consist of text that makes up only part of a
@@ -2773,10 +2773,13 @@
27732773 recognized by Postgres (see
27742774 https://www.postgresql.org/docs/current/static/multibyte.html), a
27752775 slower loading routine will be used, with encoding conversion handled
2776-by execsql. Explicitly setting the quote and delimiter characters in
2777-the metacommand will also cause execsql to use its
2778-own import routine instead of the fast file reading features of
2779-Postgres or MySQL.
2776+by execsql.
2777+
2778+Performance will be better when importing data to an existing table
2779+rather than using the NEW or REPLACEMENT keywords to have the table
2780+created automatically. The time required for execsql to scan an
2781+entire file to determine data types can be much greater than the
2782+time required to import the file.
27802783
27812784 The sheet name used when importing data from a spreadsheet can be
27822785 either the sheet name, as it appears on the tab at the bottom of the
@@ -2802,23 +2805,6 @@
28022805 when :ref:`AUTOCOMMIT <autocommit>` is off,
28032806 the IMPORT metacommand generally should not be used within :ref:`BATCHes<batch>`.
28042807
2805-
2806-.. index::
2807- single: Postgres
2808-
2809-Some performance considerations when using IMPORT are:
2810-
2811- * Creating the table using a separate CREATE TABLE statement before
2812- the IMPORT metacommand will be faster than using the NEW or
2813- REPLACEMENT keywords. The time required for execsql to scan an
2814- entire file to determine data types can be much greater than the
2815- time required to import the file.
2816-
2817- * When Postgres' fast file reading feature is used, data are read and
2818- processed in chunks that are 32 kb in size. A larger or smaller value
2819- may give better performance, depending on system-specific conditions.
2820- The "-z" command-line option can be used to alter the buffer size.
2821-
28222808 .. index::
28232809 single: MySQL
28242810 single: MariaDB
diff -r 05a4dfe548b8 -r 907a882a546c doc/source/usage.rst
--- a/doc/source/usage.rst Sat Apr 04 06:29:26 2020 -0700
+++ b/doc/source/usage.rst Thu Apr 30 06:31:01 2020 -0700
@@ -24,7 +24,7 @@
2424 .. index::
2525 Script files
2626
27-SQL Statement Recognition and SQL Syntax
27+SQL Statement Recognition
2828 ----------------------------------------------
2929
3030 .. index::
@@ -184,6 +184,65 @@
184184 single: SQL Server
185185 single: MS-Access
186186
187+Performance
188+--------------------------
189+
190+Data import, export, and display, and logging, all affect the run time
191+of an *execsql* script. Performance of several of these operations can
192+be affected by configuration settings and metacommand usage.
193+
194+Importing Data
195+.......................
196+
197+When the NEW or REPLACEMENT keywords are used with the
198+:ref:`IMPORT <import>` metacommand, *execsql* first reads the entire
199+data set to determine the data type for each column. This ensures
200+that the data set will be imported successfully. However, the step of
201+diagnosing the data type of every column can take longer than the step
202+of importing the data and inserting it to the database table. If the
203+structure of the incoming data is known, then creating the table first
204+with a CREATE TABLE statement will result in better performance than
205+using the NEW or REPLACEMENT keywords.
206+
207+*execsql* will use the fast file reading features of Postgres and
208+MariaDB/MySQL when it can, ordinarily resulting in substantially faster
209+data import than using INSERT statements, as *execsql* does otherwise.
210+Any specific data handling operation that *execsql* is required to perform
211+on the imported data before inserting it to the database will prevent the
212+use of fast file reading features. To allow maximum performance:
213+
214+* Do not convert :ref:`empty strings to null <empty_strings>`
215+* Do not :ref:`exclude empty rows <empty_rows>`
216+* Ensure that the incoming data and the data table have the same columns,
217+ in the same order, and do not use the 'import common columns'
218+ :ref:`metacommand <import_common_cols>` or
219+ :ref:`configuration setting <import_only_common>`.
220+* Ensure compatibility of the encodings used in the data source and the database.
221+
222+Exporting Data
223+.......................
224+
225+Changing the size of the row buffer used for data export, using either the
226+appropriate :ref:`metacommand <config_export_row_buffer>` or
227+:ref:`configuration setting <setting_export_row_buffer>` may improve the
228+performance of data exports, particularly for large data sets.
229+
230+Logging of Data Variables
231+.............................
232+
233+*execsql* will always create or update a
234+:ref:`record of operations that have been carried out <logging>`. That
235+log ordinarily includes all assignments to :ref:`data variables <data_vars>`.
236+However, some scripts may make extensive use of data variables
237+(e.g., the `upsert scripts <https://osdn.net/projects/execsql-upsert/>`_),
238+and logging of large numbers of data variable assignments can reduce
239+a script's performance. Therefore, logging of data variables can be
240+turned off with either a :ref:`metacommand <logdatavars>` or a
241+:ref:`configuration setting <conf_log_datavars>`.
242+
243+
244+
245+
187246 DSN Connections
188247 --------------------------
189248
旧リポジトリブラウザで表示