Merge bk-internal.mysql.com:/home/bk/mysql-4.0

into mashka.mysql.fi:/home/my/mysql-4.0

Commit df98925f authored Apr 29, 2003 by unknown

Showing with 222 additions and 187 deletions

Docs/internals.texi +219 -187

VC++Files/mysql.dsw +3 -0

--- a/Docs/internals.texi
+++ b/Docs/internals.texi
@@ -43,18 +43,18 @@ END-INFO-DIR-ENTRY
 @page
 @end titlepage

-@node Top, caching, (dir), (dir)
+@node Top, coding guidelines, (dir), (dir)

 @ifinfo
 This is a manual about @strong{MySQL} internals.
 @end ifinfo

 @menu
+* coding guidelines::           Coding Guidelines
 * caching::                     How MySQL Handles Caching
-* join_buffer_size::
+* join_buffer_size::            
 * flush tables::                How MySQL Handles @code{FLUSH TABLES}
-* filesort::                    How MySQL Does Sorting (@code{filesort})
-* coding guidelines::           Coding Guidelines
+* Algorithms::                  
 * mysys functions::             Functions In The @code{mysys} Library
 * DBUG::                        DBUG Tags To Use
 * protocol::                    MySQL Client/Server Protocol
@@ -67,7 +67,167 @@ This is a manual about @strong{MySQL} internals.
 @end menu


-@node caching, join_buffer_size, Top, Top
+@node coding guidelines, caching, Top, Top
+@chapter Coding Guidelines
+
+@itemize @bullet
+
+@item
+We use @uref{http://www.bitkeeper.com/, BitKeeper} for source management.
+
+@item
+You should use the @strong{MySQL} 4.0 source for all developments.
+
+@item
+If you have any questions about the @strong{MySQL} source, you can post these
+to @email{dev-public@@mysql.com} and we will answer them.  Please
+remember not to use this internal email list in public!
+
+@item
+Try to write code in a lot of black boxes that can be reused or use at
+least a clean, easy to change interface.
+
+@item
+Reuse code.  There are already a lot of algorithms in MySQL for list handling,
+queues, dynamic and hashed arrays, sorting, etc. that can be reused.
+
+@item
+Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
+@code{my_malloc()} that you can find in the @code{mysys} library instead 
+of the direct system calls.  This will make your code easier to debug and
+more portable.
+
+@item
+Try to always write optimized code, so that you don't have to
+go back and rewrite it a couple of months later.  It's better to
+spend 3 times as much time designing and writing an optimal function than
+having to do it all over again later on.
+
+@item
+Avoid CPU wasteful code, even where it does not matter, so that
+you will not develop sloppy coding habits.
+
+@item
+If you can write it in fewer lines, do it (as long as the code will not
+be slower or much harder to read).
+
+@item
+Don't use two commands on the same line.
+
+@item
+Do not check the same pointer for @code{NULL} more than once.
+
+@item
+Use long function and variable names in English.  This makes your code
+easier to read. 
+
+@item
+Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_} 
+rather than dancing SHIFT to separate words in identifiers).
+
+@item
+Think assembly - make it easier for the compiler to optimize your code.
+
+@item
+Comment your code when you do something that someone else may think
+is not ``trivial''.
+
+@item
+Use @code{libstring} functions (in the @file{strings} directory)
+instead of standard @code{libc} string functions whenever possible.
+
+@item
+Avoid using @code{malloc()} (it's REAL slow).  For memory allocations
+that only need to live for the lifetime of one thread, one should use
+@code{sql_alloc()} instead.
+
+@item
+Before making big design decisions, please first post a summary of
+what you want to do, why you want to do it, and how you plan to do
+it.  This way we can easily provide you with feedback and also
+easily discuss it thoroughly if some other developer thinks there is a better
+way to do the same thing!
+
+@item
+Class names start with a capital letter.
+
+@item
+Structure types are @code{typedef}'ed to an all-caps identifier.
+
+@item
+Any @code{#define}'s are in all-caps.
+
+@item
+Matching @samp{@{} are in the same column.
+
+@item
+Put the @samp{@{} after a @code{switch} on the same line, as this gives 
+better overall indentation for the switch statement:
+
+@example
+switch (arg) @{
+@end example
+
+@item
+In all other cases, @samp{@{} and @samp{@}} should be on their own line, except
+if there is nothing inside @samp{@{} and @samp{@}}.
+
+@item
+Have a space after @code{if}
+
+@item
+Put a space after @samp{,} for function arguments
+
+@item
+Functions return @samp{0} on success, and non-zero on error, so you can do:
+
+@example
+if (a() || b() || c()) @{ error("something went wrong"); @}
+@end example
+
+@item
+Using @code{goto} is okay if not abused.
+
+@item
+Avoid default variable initializations; use @code{LINT_INIT()} if the
+compiler complains after making sure that there is really no way
+the variable can be used uninitialized.
+
+@item
+Do not instantiate a class if you do not have to.
+
+@item
+Use pointers rather than array indexing when operating on strings.
+
+@end itemize
+
+Suggested mode in emacs:
+
+@example
+(load "cc-mode")
+(setq c-mode-common-hook '(lambda ()
+			    (turn-on-font-lock)
+			    (setq comment-column 48)))
+(setq c-style-alist
+      (cons
+       '("MY"
+	 (c-basic-offset . 2)
+	 (c-comment-only-line-offset . 0)
+	 (c-offsets-alist . ((statement-block-intro . +)
+			     (knr-argdecl-intro . 0)
+			     (substatement-open . 0)
+			     (label . -)
+			     (statement-cont . +)
+			     (arglist-intro . c-lineup-arglist-intro-after-paren)
+			     (arglist-close . c-lineup-arglist)
+			     ))
+	 )
+       c-style-alist))
+(c-set-style "MY")
+(setq c-default-style "MY")
+@end example
+
+@node caching, join_buffer_size, coding guidelines, Top
 @chapter How MySQL Handles Caching

 @strong{MySQL} has the following caches:
@@ -181,7 +341,7 @@ same algorithm described above to handle it.  (In other words, we store
 the same row combination several times into different buffers)
 @end itemize

-@node flush tables, filesort, join_buffer_size, Top
+@node flush tables, Algorithms, join_buffer_size, Top
 @chapter How MySQL Handles @code{FLUSH TABLES}

 @itemize @bullet
@@ -226,8 +386,19 @@ After this it will give other threads a chance to open the same tables.

 @end itemize

-@node filesort, coding guidelines, flush tables, Top
-@chapter How MySQL Does Sorting (@code{filesort})
+@node Algorithms, mysys functions, flush tables, Top
+@chapter Different algorithms used in MySQL
+
+MySQL uses a lot of different algorithms.  This chapter tries to describe
+some of these:
+
+@menu
+* filesort::                    
+* bulk-insert::                 
+@end menu
+
+@node filesort, bulk-insert, Algorithms, Algorithms
+@section How MySQL Does Sorting (@code{filesort})

 @itemize @bullet

@@ -266,169 +437,20 @@ and then we read the rows in the sorted order into a row buffer

 @end itemize

+@node bulk-insert,  , filesort, Algorithms
+@section Bulk insert

-@node coding guidelines, mysys functions, filesort, Top
-@chapter Coding Guidelines
-
-@itemize @bullet
-
-@item
-We use @uref{http://www.bitkeeper.com/, BitKeeper} for source management.
-
-@item
-You should use the @strong{MySQL} 4.0 source for all developments.
-
-@item
-If you have any questions about the @strong{MySQL} source, you can post these
-to @email{dev-public@@mysql.com} and we will answer them.  Please
-remember to not use this internal email list in public!
-
-@item
-Try to write code in a lot of black boxes that can be reused or use at
-least a clean, easy to change interface.
-
-@item
-Reuse code;  There is already a lot of algorithms in MySQL for list handling,
-queues, dynamic and hashed arrays, sorting, etc. that can be reused.
-
-@item
-Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
-@code{my_malloc()} that you can find in the @code{mysys} library instead 
-of the direct system calls;  This will make your code easier to debug and 
-more portable.
-
-@item
-Try to always write optimized code, so that you don't have to
-go back and rewrite it a couple of months later.  It's better to
-spend 3 times as much time designing and writing an optimal function than
-having to do it all over again later on.
-
-@item
-Avoid CPU wasteful code, even where it does not matter, so that
-you will not develop sloppy coding habits.
-
-@item
-If you can write it in fewer lines, do it (as long as the code will not
-be slower or much harder to read).
-
-@item
-Don't use two commands on the same line.
-
-@item
-Do not check the same pointer for @code{NULL} more than once.
-
-@item
-Use long function and variable names in English.  This makes your code
-easier to read. 
-
-@item
-Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_} 
-rather than dancing SHIFT to seperate words in identifiers).
-
-@item
-Think assembly - make it easier for the compiler to optimize your code.
-
-@item
-Comment your code when you do something that someone else may think
-is not ``trivial''.
-
-@item
-Use @code{libstring} functions (in the @file{strings} directory)
-instead of standard @code{libc} string functions whenever possible.
-
-@item
-Avoid using @code{malloc()} (its REAL slow);  For memory allocations 
-that only need to live for the lifetime of one thread, one should use
-@code{sql_alloc()} instead.
-
-@item
-Before making big design decisions, please first post a summary of
-what you want to do, why you want to do it, and how you plan to do
-it.  This way we can easily provide you with feedback and also
-easily discuss it thoroughly if some other developer thinks there is better
-way to do the same thing!
-
-@item
-Class names start with a capital letter.
-
-@item
-Structure types are @code{typedef}'ed to an all-caps identifier.
-
-@item
-Any @code{#define}'s are in all-caps.
-
-@item
-Matching @samp{@{} are in the same column.
-
-@item
-Put the @samp{@{} after a @code{switch} on the same line, as this gives 
-better overall indentation for the switch statement:
-
-@example
-switch (arg) @{
-@end example
-
-@item
-In all other cases, @samp{@{} and @samp{@}} should be on their own line, except
-if there is nothing inside @samp{@{} and @samp{@}}.
-
-@item
-Have a space after @code{if}
-
-@item
-Put a space after @samp{,} for function arguments
-
-@item
-Functions return @samp{0} on success, and non-zero on error, so you can do:
-
-@example
-if(a() || b() || c()) @{ error("something went wrong"); @}
-@end example
-
-@item
-Using @code{goto} is okay if not abused.
-
-@item
-Avoid default variable initalizations, use @code{LINT_INIT()} if the
-compiler complains after making sure that there is really no way
-the variable can be used uninitialized.
-
-@item
-Do not instantiate a class if you do not have to.
-
-@item
-Use pointers rather than array indexing when operating on strings.
-
-@end itemize
-
-Suggested mode in emacs:
-
-@example
-(load "cc-mode")
-(setq c-mode-common-hook '(lambda ()
-			    (turn-on-font-lock)
-			    (setq comment-column 48)))
-(setq c-style-alist
-      (cons
-       '("MY"
-	 (c-basic-offset . 2)
-	 (c-comment-only-line-offset . 0)
-	 (c-offsets-alist . ((statement-block-intro . +)
-			     (knr-argdecl-intro . 0)
-			     (substatement-open . 0)
-			     (label . -)
-			     (statement-cont . +)
-			     (arglist-intro . c-lineup-arglist-intro-after-paren)
-			     (arglist-close . c-lineup-arglist)
-			     ))
-	 )
-       c-style-alist))
-(c-set-style "MY")
-(setq c-default-style "MY")
-@end example
+The logic behind the bulk insert optimisation is simple.

+Instead of writing each key value to the b-tree (that is, to the keycache,
+although the bulk insert code doesn't know about the keycache), keys are
+stored in a balanced binary (red-black) tree in memory.  When this tree
+reaches its memory limit, it writes all keys to disk (to the keycache, that
+is).  Because the key stream coming from the binary tree is already sorted,
+inserting goes much faster: all the necessary pages are already in the
+cache, disk access is minimized, etc.

-@node mysys functions, DBUG, coding guidelines, Top
+@node mysys functions, DBUG, Algorithms, Top
 @chapter Functions In The @code{mysys} Library

 Functions in @code{mysys}: (For flags see @file{my_sys.h})
@@ -624,6 +646,16 @@ Print query.
 * fieldtype codes::             
 * protocol functions::          
 * protocol version 2::          
+* 4.1 protocol changes::        
+* 4.1 field packet::            
+* 4.1 field desc::              
+* 4.1 ok packet::               
+* 4.1 end packet::              
+* 4.1 error packet::            
+* 4.1 prep init::               
+* 4.1 long data::               
+* 4.1 execute::                 
+* 4.1 binary result::           
 @end menu

 @node raw packet without compression, raw packet with compression, protocol, protocol
@@ -690,7 +722,7 @@ is the header of the packet.
 @end menu


-@node ok packet, error packet, basic packets, basic packets, basic packets
+@node ok packet, error packet, basic packets, basic packets
 @subsection OK Packet

 For details, see @file{sql/net_pkg.cc::send_ok()}.
@@ -720,7 +752,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}.
 @end table


-@node error packet,  , ok packet, basic packets, basic packets
+@node error packet,  , ok packet, basic packets
 @subsection Error Packet	

 @example
@@ -835,7 +867,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}.
 		n data
 @end example
 
-@node fieldtype codes, protocol functions, communication
+@node fieldtype codes, protocol functions, communication, protocol
 @section Fieldtype Codes

 @example
@@ -859,7 +891,7 @@ Time            03 08 00 00     |01 0B                  |03 00 00 00
 Date            03 0A 00 00     |01 0A                  |03 00 00 00
 @end example

-@node protocol functions, protocol version 2, fieldtype codes
+@node protocol functions, protocol version 2, fieldtype codes, protocol
 @section Functions used to implement the protocol

 @c This should be merged with the above one and changed to texi format
@@ -971,7 +1003,7 @@ client. If this is equal to the new message the client sends to the
 server then the password is accepted.
 @end example

-@node protocol version 2, 4.1 protocol changes, protocol functions
+@node protocol version 2, 4.1 protocol changes, protocol functions, protocol
 @section Another description of the protocol

 @c This should be merged with the above one and changed to texi format.
@@ -1664,7 +1696,7 @@ fe 00                       . .
 @c @node 4.1 protocol,,,
 @c @chapter MySQL 4.1 protocol

-@node 4.1 protocol changes, 4.1 field packet, protocol version 2
+@node 4.1 protocol changes, 4.1 field packet, protocol version 2, protocol
 @section Changes to 4.0 protocol in 4.1

 All basic packet handling is identical to 4.0. When communication
@@ -1699,7 +1731,7 @@ results will sent as binary (low-byte-first).
 @end itemize


-@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes
+@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes, protocol
 @section 4.1 field description packet

 The field description packet is sent as a response to a query that
@@ -1719,7 +1751,7 @@ uses this to send the number of rows in the table)
 This packet is always followed by a field description set.
 @xref{4.1 field desc}.

-@node 4.1 field desc, 4.1 ok packet, 4.1 field packet
+@node 4.1 field desc, 4.1 ok packet, 4.1 field packet, protocol
 @section 4.1 field description result set

 The field description result set contains the meta info for a result set.
@@ -1737,7 +1769,7 @@ The field description result set contains the meta info for a result set.
 @end multitable


-@node 4.1 ok packet, 4.1 end packet, 4.1 field desc
+@node 4.1 ok packet, 4.1 end packet, 4.1 field desc, protocol
 @section 4.1 ok packet

 The ok packet is the first that is sent as an response for a query
@@ -1763,7 +1795,7 @@ The message is optional.  For example for multi line INSERT it
 contains a string for how many rows was inserted / deleted.


-@node 4.1 end packet, 4.1 error packet, 4.1 ok packet
+@node 4.1 end packet, 4.1 error packet, 4.1 ok packet, protocol
 @section 4.1 end packet

 The end packet is sent as the last packet for
@@ -1792,7 +1824,7 @@ by checking the packet length < 9 bytes (in which case it's and end
 packet).


-@node 4.1 error packet, 4.1 prep init, 4.1 end packet
+@node 4.1 error packet, 4.1 prep init, 4.1 end packet, protocol
 @section 4.1 error packet.

 The error packet is sent when something goes wrong.
@@ -1809,7 +1841,7 @@ The client/server protocol is designed in such a way that a packet
 can only start with 255 if it's an error packet.


-@node 4.1 prep init, 4.1 long data, 4.1 error packet
+@node 4.1 prep init, 4.1 long data, 4.1 error packet, protocol
 @section 4.1 prepared statement init packet

 This is the return packet when one sends a query with the COM_PREPARE
@@ -1843,7 +1875,7 @@ prepared statement will contain a result set. In this case the packet
 is followed by a field description result set. @xref{4.1 field desc}.


-@node 4.1 long data, 4.1 execute, 4.1 prep init
+@node 4.1 long data, 4.1 execute, 4.1 prep init, protocol
 @section 4.1 long data handling

 This is used by mysql_send_long_data() to set any parameter to a string
@@ -1870,7 +1902,7 @@ The server will NOT send an @code{ok} or @code{error} packet in
 responce for this.  If there is any errors (like to big string), one
 will get the error when calling execute.

-@node 4.1 execute, 4.1 binary result, 4.1 long data
+@node 4.1 execute, 4.1 binary result, 4.1 long data, protocol
 @section 4.1 execute

 On execute we send all parameters to the server in a COM_EXECUTE
@@ -1908,7 +1940,7 @@ The parameters are stored the following ways:
 The result for this will be either an ok packet or a binary result
 set.

-@node 4.1 binary result, , 4.1 execute
+@node 4.1 binary result,  , 4.1 execute, protocol
 @section 4.1 binary result set

 A binary result are sent the following way.
@@ -2384,7 +2416,7 @@ work for different record formats are: /myisam/mi_statrec.c,
 /myisam/mi_dynrec.c, and /myisam/mi_packrec.c.
 @*

-@node InnoDB Record Structure,InnoDB Page Structure,MyISAM Record Structure,Top
+@node InnoDB Record Structure, InnoDB Page Structure, MyISAM Record Structure, Top
 @chapter InnoDB Record Structure

 This page contains:
@@ -2690,7 +2722,7 @@ shorter because the NULLs take no space.
 The most relevant InnoDB source-code files are rem0rec.c, rem0rec.ic,
 and rem0rec.h in the rem ("Record Manager") directory.

-@node InnoDB Page Structure,Files in MySQL Sources,InnoDB Record Structure,Top
+@node InnoDB Page Structure, Files in MySQL Sources, InnoDB Record Structure, Top
 @chapter InnoDB Page Structure

 InnoDB stores all records inside a fixed-size unit which is commonly called a
@@ -3121,7 +3153,7 @@ header.
 The most relevant InnoDB source-code files are page0page.c,
 page0page.ic, and page0page.h in \page directory.

-@node Files in MySQL Sources,Files in InnoDB Sources,InnoDB Page Structure,Top
+@node Files in MySQL Sources, Files in InnoDB Sources, InnoDB Page Structure, Top
 @chapter Annotated List Of Files in the MySQL Source Code Distribution

 This is a description of the files that you get when you download the
@@ -4942,7 +4974,7 @@ The MySQL program that uses zlib is \mysys\my_compress.c. The use is
 for packet compression. The client sends messages to the server which
 are compressed by zlib. See also: \sql\net_serv.cc.

-@node Files in InnoDB Sources,,Files in MySQL Sources,Top
+@node Files in InnoDB Sources,  , Files in MySQL Sources, Top
 @chapter Annotated List Of Files in the InnoDB Source Code Distribution

 ERRATUM BY HEIKKI TUURI (START)
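
Note on the bulk insert section added to Docs/internals.texi above: the text describes buffering keys in memory and writing them out in sorted order once a memory limit is reached. The idea can be sketched in C roughly as follows. This is not MySQL's code. It assumes integer keys, swaps the red-black tree for a plain array that is sorted with qsort() at flush time, and uses an in-memory array as a stand-in for the keycache. The names BULK_INSERT, bulk_init(), bulk_insert_key(), bulk_flush(), BULK_BUFFER_KEYS and KEY_CACHE_KEYS are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BULK_BUFFER_KEYS 4      /* hypothetical in-memory limit, in keys */
#define KEY_CACHE_KEYS   64     /* hypothetical capacity of the stand-in keycache */

typedef struct st_bulk_insert
{
  int keys[BULK_BUFFER_KEYS];   /* in-memory buffer (stand-in for the tree) */
  size_t used;                  /* number of buffered keys */
  int written[KEY_CACHE_KEYS];  /* stand-in for the keycache / b-tree */
  size_t written_count;         /* number of keys written so far */
  size_t flush_calls;           /* number of non-empty flushes */
} BULK_INSERT;

/* qsort() comparator for int keys; avoids overflow from subtraction. */
int compare_keys(const void *a, const void *b)
{
  int x= *(const int *) a;
  int y= *(const int *) b;
  return (x > y) - (x < y);
}

void bulk_init(BULK_INSERT *bulk)
{
  bulk->used= 0;
  bulk->written_count= 0;
  bulk->flush_calls= 0;
}

/* Write all buffered keys in sorted order.  Returns 0 on success. */
int bulk_flush(BULK_INSERT *bulk)
{
  size_t i;
  if (!bulk->used)
    return 0;                   /* nothing buffered, nothing to do */
  if (bulk->written_count + bulk->used > KEY_CACHE_KEYS)
    return 1;                   /* stand-in keycache is full */
  qsort(bulk->keys, bulk->used, sizeof(bulk->keys[0]), compare_keys);
  for (i= 0; i < bulk->used; i++)
    bulk->written[bulk->written_count++]= bulk->keys[i];
  bulk->used= 0;
  bulk->flush_calls++;
  return 0;
}

/* Buffer one key, flushing first if the buffer is full.  Returns 0 on success. */
int bulk_insert_key(BULK_INSERT *bulk, int key)
{
  if (bulk->used == BULK_BUFFER_KEYS && bulk_flush(bulk))
    return 1;
  assert(bulk->used < BULK_BUFFER_KEYS);
  bulk->keys[bulk->used++]= key;
  return 0;
}
```

With a limit of four keys, inserting 5, 3, 9, 1, 8, 2 and then calling bulk_flush() once more writes two sorted runs, 1 3 5 9 followed by 2 8, in two flushes. Each batch is sorted on its own, not the whole stream, which matches the description of flushing whenever the in-memory structure fills up. Both functions follow the guideline from the same file of returning 0 on success and non-zero on error.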

--- a/VC++Files/mysql.dsw
+++ b/VC++Files/mysql.dsw
@@ -605,6 +605,9 @@ Package=<5>

 Package=<4>
 {{{
+    Begin Project Dependency
+    Project_Dep_Name strings
+    End Project Dependency
 }}}

 ###############################################################################