News Feed : planetmysql

On iostat, disk latency; iohist onward! -- Jeremy Cole

Just a little heads-up and a bit of MySQL-related technical content for all of you still out there following along…

At Proven Scaling, we take on MySQL performance problems pretty regularly, so I’m often in need of good tools to characterize current performance and find any issues. In the database world, you’re really looking for a few things of interest related to I/O: throughput in bytes, throughput in requests, and latency. The typical tool to get this information on Linux is iostat. You would normally run it like iostat -dx 1 sda and its output would be something like this, repeating every 1 second:


Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda       0.00   8.00 0.00 4.00   0.00  96.00  0.00 48.00    24.00     0.06 15.75 15.75  6.30

Most of the output of iostat is interesting and reasonable for its intended purpose, which is as a general purpose way to monitor I/O. The really interesting things for most database servers (especially those in trouble) are:

  • avgrq-sz — Average request size, in sectors (the 24.00 in the sample above is 96.00 wsec/s divided by 4.00 w/s).
  • avgqu-sz — Average I/O queue length, in requests.
  • await — Average time for a request to complete, in milliseconds. This covers both the time spent waiting in the queue and scheduler and the time spent being serviced, so it includes svctm and should never be lower than svctm.
  • svctm — Average service time for I/O requests, in milliseconds, not counting time spent queued. This is the most interesting number for any write-heavy transactional database server, as it translates directly to transaction commit time.
  • %util — Approximate percent utilization for the device.

There is one major problem with using iostat to monitor MySQL/InnoDB servers: svctm and await combine reads and writes. With a reasonably configured InnoDB, on a server with RAID with a battery-backed write cache (BBWC), reads and writes will have very different behaviour. In general, with a non-filled cache, writes should complete (to the BBWC) in just about zero milliseconds. Reads should take approximately the theoretical average time possible on the underlying disk subsystem.

I’ve often found myself scratching my head looking at a nonsensical svctm due to reads and writes being combined together. One day I was perplexed enough to do something about it: I opened up the code for iostat to see how it worked. It turns out that the core of what it does is quite simple (so much so, I wonder why it’s C instead of Perl): it opens /proc/diskstats and /proc/stat and does some magic to the contents.
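As a rough illustration of that magic (a minimal Python sketch, not the iostat source and not the script described in this post): each /proc/diskstats line carries separate cumulative counters for reads and writes, so sampling twice and dividing the change in milliseconds by the change in completed requests gives a per-direction average. The function names and the two sample lines are made up for the example. Note that the kernel's time counters include time spent queued, so the result is closer to a per-direction await than to a pure service time.

```python
from collections import namedtuple

# Cumulative counters taken from one /proc/diskstats line.
DiskSample = namedtuple("DiskSample", "reads read_ms writes write_ms")


def parse_diskstats_line(line):
    """Return (device_name, DiskSample) for one /proc/diskstats line."""
    fields = line.split()
    # fields[0], fields[1] = major, minor; fields[2] = device name
    # fields[3] = reads completed,  fields[6]  = ms spent reading
    # fields[7] = writes completed, fields[10] = ms spent writing
    return fields[2], DiskSample(
        reads=int(fields[3]), read_ms=int(fields[6]),
        writes=int(fields[7]), write_ms=int(fields[10]))


def average_latency_ms(before, after):
    """Average (read_ms, write_ms) per request between two samples.

    A direction with no completed requests in the interval yields None.
    """
    def avg(delta_ms, delta_ios):
        return delta_ms / delta_ios if delta_ios else None

    return (avg(after.read_ms - before.read_ms, after.reads - before.reads),
            avg(after.write_ms - before.write_ms, after.writes - before.writes))


# Two made-up samples for the same device, standing in for two reads of
# /proc/diskstats taken one second apart.
FIRST = "   8       0 sda 100 0 800 500 200 0 1600 1000 0 700 1500"
SECOND = "   8       0 sda 110 0 880 560 204 0 1632 1063 0 760 1623"

if __name__ == "__main__":
    _, before = parse_diskstats_line(FIRST)
    _, after = parse_diskstats_line(SECOND)
    print(average_latency_ms(before, after))
```

With the made-up samples, ten reads took 60 ms in total and four writes took 63 ms, so the two directions come out at 6.0 ms and 15.75 ms, a difference that a combined figure would hide.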

What I really wanted is a histogram of the reads and writes (separately, please!) for the given device. I hacked up a quick script to do that, and noticed how incredibly useful it is. I recently had to extend it to address other customer needs, so I worked on it a bit more and now it looks pretty good. Here’s an example from a test machine (so not that realistic for a MySQL server):

util:   1.27% r_ios:     0  w_ios:     1  aveq:     0,
ms : r_svctm                     : w_svctm
 0 :                             :
 1 :                             :
 2 :                             :
 3 : x                           :
 4 : x                           :
 5 : xxx                         :
 6 : xxxx                        :
 7 :                             :
 8 : x                           : x
 9 : x                           : xx
10 : x                           : xxxxx
11 :                             : xxxxxxxxxxxxxxx
12 :                             : xxxxxxxxxxxxxxxxxxxxxxxxx
13 : xx                          : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
14 :                             : xxxxxxxxxxxxxxxxxxxxx
15 : xx                          : xxxxxxxxxxx
16 : x                           : xxxxx
17 : x                           : xxxxxx
18 :                             : xxxx
19 : x                           : xx
20 :                             : x
21 :                             : x
22 :                             : x
23 :                             : x
24 :                             : x
25 :                             :
26 :                             :
27 :                             :
28 : x                           :
29 :                             :
30 :                             :
++ : 0                           : 250

It uses Curses now to avoid redrawing the entire screen, and I’ve got a ton of ideas on how to improve it. I have a few more must-haves before I release it formally to the world, but I wonder what other features people would want from it. It is Linux-only for the foreseeable future.
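For anyone who wants to experiment in the meantime, the layout above can be approximated with a short sketch. This is Python written for illustration, not the script itself: render_histogram and its parameters are made up, and it assumes one x per sample, one row per whole millisecond, and a final ++ row that counts everything slower than the last row.

```python
def render_histogram(read_ms, write_ms, max_ms=30, width=27):
    """Render read and write latency samples as side-by-side text histograms."""

    def bucket(samples):
        # One bucket per whole millisecond, plus an overflow counter.
        counts = [0] * (max_ms + 1)
        overflow = 0
        for ms in samples:
            index = int(ms)
            if index > max_ms:
                overflow += 1
            else:
                counts[index] += 1
        return counts, overflow

    r_counts, r_over = bucket(read_ms)
    w_counts, w_over = bucket(write_ms)

    lines = ["ms : %-*s : %s" % (width, "r_svctm", "w_svctm")]
    for ms in range(max_ms + 1):
        row = "%2d : %-*s : %s" % (ms, width, "x" * r_counts[ms], "x" * w_counts[ms])
        lines.append(row.rstrip())
    # The overflow row shows counts rather than bars.
    lines.append("++ : %-*s : %d" % (width, r_over, w_over))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_histogram([3.2, 5.0, 5.9, 45], [12, 12.5, 100, 200]))
```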

What do you think?

MySQL Find 0.9.0 released -- Baron Schwartz (xaprb)

If you've used the UNIX find command for more than a trivial find-and-print, you know how powerful it is; it's almost a miniature programming environment to find and manipulate files and directories. What if you could do the same thing with MySQL tables and databases? That was the inspiration for writing this tool. I was about to write several other tools to do some MySQL administrative jobs when I realized I could generalize and make something much more useful and powerful.

Hacking MySQL: SIGNAL support (I) -- Jorge Bernal

I’ve been looking for an open source project to collaborate on for some time now, and given the time I’m spending with MySQL lately and the expertise I’m gaining thanks to MySQL training, it looked like an obvious choice.

During the last advanced bootcamp, Tobias found bug #27894, which apparently was a simple fix. Dates in binlog were formatted as 736 instead of 070306 (for 2007-03-06). During the bootcamp I used my lonely nights at the hotel and came up with a patch, and some days later my first contribution was going into the main MySQL code.

The problem

Now I had to find something bigger. One of the things that annoys me most about MySQL is the lack of some way to abort a procedure or trigger: there is no raise method. To generate a custom error you have to do hacks like:

SELECT `Error: Invalid firmware series for this model` INTO dummy FROM model;

The solution

There is a SIGNAL command in the SQL:2003 standard which does the job, but it’s not implemented (yet) in MySQL. The syntax is as follows:

SIGNAL signal_value [ SET signal_information_list ] 

signal_value:
    condition_name
  | sqlstate_value

signal_information_list:
    [ signal_information_list , ] signal_information_item 

signal_information_item:
    condition_name = condition_value

condition_name:
    CLASS_ORIGIN
  | SUBCLASS_ORIGIN
  | CONSTRAINT_CATALOG
  | CONSTRAINT_SCHEMA
  | CONSTRAINT_NAME
  | CATALOG_NAME
  | SCHEMA_NAME
  | TABLE_NAME
  | COLUMN_NAME
  | CURSOR_NAME
  | MESSAGE_TEXT

In this first part I’ll cover the basics: just the SIGNAL command with a fixed generic error, enough to get rid of the dirty hacks.

The implementation

Getting used to foreign code always involves some difficulty, but when you have to deal with grammars and parsers it’s all crazy fun. First, we have to add a symbol for our new command:

sql/lex.h

In this file, we have a symbols[] array where we have to add SIGNAL. Since it seems to be sorted in alphabetic order, we’ll put our line between SHUTDOWN and SIGNED:

   { "SHUTDOWN",    SYM(SHUTDOWN)},
   { "SIGNAL",    SYM(SIGNAL_SYM)},
   { "SIGNED",    SYM(SIGNED_SYM)},

sql/share/errmsg.txt

Before we get our hands dirty with the parser file, let’s get our custom error prepared. I took a look at the SQLSTATE error messages and found 38503 (Exception generated from user-defined function/procedure) to be closely enough related to this.

In this file we have a series of error constants with their corresponding error messages in various languages. Since our new error will be related to stored procedures, I decided to put it with the rest of the SP-related errors:

 ER_SP_CASE_NOT_FOUND 20000
         eng "Case not found for CASE statement"
         ger "Fall für CASE-Anweisung nicht gefunden"
 ER_SP_SIGNAL 38503
         eng "Exception generated from user-defined function/procedure"
 ER_FPARSER_TOO_BIG_FILE
         eng "Configuration file '%-.64s' is too big"
         ger "Konfigurationsdatei '%-.64s' ist zu groß"

sql/sql_yacc.yy

And finally to the point. Here we have to declare that we’ll be using the SIGNAL_SYM which we defined at sql/lex.h as a token.

 %token  SHUTDOWN
 %token  SIGNAL_SYM
 %token  SIGNED_SYM

Then, in the sp_proc_stmt label (look for sp_proc_stmt: at the beginning of a line), we add sp_proc_stmt_signal as another possibility (we’ll define this in a minute):

 	| sp_proc_stmt_iterate
 	| sp_proc_stmt_signal
 	| sp_proc_stmt_open

And finally, between the sp_proc_stmt_iterate and the sp_proc_stmt_open definition we add our code:

sp_proc_stmt_signal:
    SIGNAL_SYM
    {
      LEX *lex= Lex;
      sp_head *sp= lex->sphead;
      sp_instr_error *i;

      i= new sp_instr_error(sp->instructions(), lex->spcont, ER_SP_SIGNAL);
      sp->add_instr(i);
    }

This basically tells the parser to expect the SIGNAL_SYM token (SIGNAL) with no arguments, and to generate an error with our new error code (ER_SP_SIGNAL). As you can see, there’s some extra code which I copied directly from similar definitions, and which I’ll refer to as parser magic (anyone willing to explain what the sphead and lex variables are will be very welcome).

Conclusion

This one isn’t extremely difficult if you have some previous experience with Bison, but the next part could be more interesting, since I guess we’ll have to add more than sp_instr_error to be able to show custom error messages. Also, we’ll have to prepare some test cases to verify our newly created behaviour.

I hope this helps someone trying to contribute to MySQL. If you want to try this at home you can follow the article or apply the patch.

So Silly .... -- Peter Laursen

With the announcement of MySQL 6 it is no longer a secret that MySQL AB thinks the basic (transactional) storage engine should be FALCON in the not-so-distant future.

The MySQL GUI tools also support it now, and have for a while. I would say 'support as the rope supports the hanged', because the implementation is plain silly. Even when connected to MySQL 4.x you can specify FALCON as the default storage engine ... so silly. This is amateurish programming! Nothing but that! The newly released GUI tools r12 (the first after the server 6 announcement and release) does not change that.

What is the idea with the pluggable storage engine architecture if FALCON is just hard-coded everywhere?  

On the contrary SQLyog gives you an option to select FALCON (and PBXT and SOLIDDB and everything else available) when it makes sense: when the storage engine is available, and only then.

Hacking MySQL table logs -- Giuseppe Maxia

Shortly before the MySQL Users Conference I announced that I would cover new ground in table logs management.
I am keeping that promise, and in addition I am also showing some related hacks.

The announced facts from last year's usability report were that you can't change log tables at will, as you can do with log files, and you can't change the log table engine to FEDERATED. Both claims, as it turned out, were incorrect. You can do such things, albeit not in a straightforward manner. As a bonus side effect, you can also:
  • add triggers to log tables;
  • filter log tables depending on user defined criteria, such as query type, user database, or time;
  • centralize logs from several servers.


Read the rest of the article at O'Reilly Databases site

451 CAOS Links - 2007.05.08 -- The 451 Group

Sun advances OpenJDK with new code and governance board. Dell joins the Microsoft/Novell collaboration effort. GroundWork releases new version with SOA development framework. (and more)

Note: Due to an international flight and limited Internet access in transit, there was no 451 CAOS Links on Monday 05/07/07.

Sun Fulfills Promise of Open and Free Java Technology and Releases Java SE Platform to OpenJDK Community, Sun Microsystems (Press Release)

Dell Joins Microsoft and Novell Collaboration, Microsoft / Novell / Dell (Press Release)

Sun To Develop New Communications Application Server Through Open Source GlassFish Community, Sun Microsystems (Press Release)

New Version Of Groundwork Monitor Powers First Services-Oriented Architecture For Open Source It Management, GroundWork Open Source (Press Release)

Compiere Extends Product Flexibility and Deployment Ease with New Release, Compiere (Press Release)

MySQL AB to Offer Low-Cost, High Availability Solution for Business-Critical LAMP Applications, MySQL AB (Press Release)

Jitterbit Announces Major New Additions for 1.2, Jitterbit (Press Release)

Terracotta Announces Integration of Terracotta Clustering Solution With GlassFish Application Server, Terracotta (Press Release)

Ericsson to contribute to open source application server and provide IMS support for developer communities, Ericsson (Press Release)

CCID Consulting: China’s Linux Market up by 30.9% in 2007Q1, CCID Consulting (Press Release)

Coupa Raises the Bar on Purchasing Simplified with Launch of Coupa eProcurement Enterprise, Coupa (Press Release)

SplendidCRM Announces Launch of SplendidCRM v1.4 Incorporating Key .NET 2.0 Technologies, SplendidCRM (Press Release)

Xandros Releases Xandros Server Standard Edition 2, Offering Enterprise-Grade Features to Windows-Centric SMBs, Xandros (Press Release)

Palamida Extends IP Amplifier’s Day One Reporting of Open Source Intellectual Property and Security Risks, Palamida (Press Release)

Ubuntu plans mobile Linux version, InfoWorld, John Blau (Article)

7 Challenges Red Hat Faces to Remain Open Source’s Finest, SeekingAlpha, Joe Panettieri (Article)

How 12 people banded together to make Firefox 1.0, APC Magazine, Dan Warne (Article)

Open Source Venture Investment Through Q1 of 2007, Larry Augustin’s Weblog, Larry Augustin (Blog)

Will the last top developer out of Novell please turn off the lights?, ZDNet Open Source Blog, Dana Blankenhorn (Blog)

A free software milestone, here be dragons, Mark Shuttleworth (Blog)

Look who is coming to OSBC -- InfoWorld

I just went through the attendee list for the Open Source Business Conference and loved what I saw. For the first time, OSBC is truly drawing a deep bench of IT buyers. It's something that we have strived for since the first show (well, the second, since the first show was intended to be a vendor strategy event), and which has finally happened. We have CIOs/VPs/Directors from the following companies (and I won't even bother to go into all the CXOs/VPs we have from Red Hat, MySQL, Alfresco, Microsoft, MuleSource, JasperSoft, SugarCRM, OpenBravo, Loopfuse, Zmanda, XenSource, etc. etc. - this... READ MORE

New betas of XAMPP for Linux and Windows -- Kai Seidler

And again we're on our mission to keep XAMPP up-to-date and put the first beta version of the upcoming XAMPP release in our public beta download area.

In this beta we updated both PHP versions (to 4.4.7 and 5.2.2) and phpMyAdmin (to 2.10.1). In the Windows beta we also fixed the security vulnerability published April 28th.

Get the downloads at XAMPP BETA.

XAMPP beta versions are always for testing purposes only. There will be no upgrade packages from and to beta versions. To all testers: Many thanks in advance!!

Software Freedom Law Center -- Kaj Arnö

MySQL is indebted to the Software Freedom Law Center for very good advice and insight on how to combine Free Software with a viable business model. SFLC provides legal representation and other law-related services to protect and advance Free and Open Source Software. Founded in 2005, the Center now represents many of the most important and well-established free software and open source projects.

Professor Eben Moglen, SFLC director and FSF legal counsel, has provided us with profound guidance over the years. We have tried to give something back through our work in the GPLv3 Committee B, but our time resources as a small company are limited in comparison to our fellow committee members.

In recognition of Eben’s help and as a token of our appreciation, we’ve made a small donation to support the continued work of the SFLC. We encourage others who build their business on free software to do the same.

Thank you, Eben!

Yahoo! Pipes - the Edwin Pipe in under 15 minutes -- Colin Charles

At the MySQL Conference the closing keynote was on Yahoo! Pipes, by Pasha Sadri, a Principal Software Engineer, Advanced Development Division, Yahoo!. I wanted to try it, but I was on Firefox 1.5 on Fedora Core 6 and there was no way I was going to build a pipe during the talk.

Fast forward a week or so later, and a boring Friday night ensued. What better thing to do than to play with Pipes? In under fifteen minutes, I created the Edwin Pipe. What is it? It’s a pipe that is all things MySQL: a comprehensive source of news, what’s cool, and so forth. There are some limitations: regular expression support is supposedly like Perl’s, but is not quite complete. The Unique operator is pretty cool, filtering is good (and can be improved with better regex support), and maybe there could be some sort of fuzziness in the way data is displayed (I don’t want all Digg MySQL-related items popping up at the top, or all MySQL job forum details at the bottom, etc.). Language conversion via a Babelfish operator exists, but not language filtering (maybe I only want English text displayed in my final pipe output).

That aside, the forums are pretty active. Pipes are ridiculously easy to create. It’s simply great stuff. Oh, shorter URLs would help: the URLs are so long and not feasible, in my opinion. Impressive is the support to then get RSS output, and also JSON (so all processing is done on the server side). Happy I am with sites that provide JSON feeds.

Now, for some notes I took during the closing keynote.

  • A while ago, he wanted to find an apartment near a park. Go to Craigslist and find apartment lists, then click the map link, and also check distance to a park on the map… This is tedious, and not automated.
  • Craigslist apartment RSS feed. Yahoo! Local API to find Parks. Why not tie this in together? It started with about 50 lines of Perl code, and it combined feeds + web services (this is your Web 2.0 mashup).
  • Pipes: free online service that lets you remix data and create mashups using a visual editor.
  • Pipes treats the web as a big database, as they do joins across different ‘tables’.
  • Design principles came from the Unix Pipes. They’re like pipes for the Web. Build useful applications from simple primitives.
  • The more open Pipes is, the more useful it will be (so Google goodness will also work).
  • Output available in JSON, so it can be used as another application. Get email or SMS from output, even. RSS is obviously available.
  • App Examples: Last.fm + Flickr, Babbler (Second Life, language translation) by Max Case.
  • Must enable users to solve ad-hoc problems. User generated “features” and disposable applications -> the future.
  • Pipes uses MySQL, squid for caching, PHP & Perl (lots of CPAN modules) for serving and back-end processing of the pipes.

Edwin 2.0 is already in the works. It will have more cool feeds, and probably work out all the language issues with more separated regexes. More fuzzy organizing of data, if possible. If you want to see a MySQL Blogger Photo Gallery, bman_seattle created a pipe too.


MySQL Federated ODBC - Hello PostgreSQL! -- Patrick Galbraith

Multiple Data sources in action! First, MySQL:

mysql> show plugins;
+----------------+--------+----------------+-----------------------------+---------+
| Name | Status | Type | Library | License |
+----------------+--------+----------------+-----------------------------+---------+
| binlog | ACTIVE | STORAGE ENGINE | NULL | GPL |
| partition | ACTIVE | STORAGE ENGINE | NULL | GPL |
| ARCHIVE | ACTIVE | STORAGE ENGINE | NULL | GPL |
| BLACKHOLE | ACTIVE | STORAGE ENGINE | NULL | GPL |
| CSV | ACTIVE | STORAGE ENGINE | NULL | GPL |
| FEDERATED | ACTIVE | STORAGE ENGINE | NULL | GPL |
| MEMORY | ACTIVE | STORAGE ENGINE | NULL | GPL |
| InnoDB | ACTIVE | STORAGE ENGINE | NULL | GPL |
| MyISAM | ACTIVE | STORAGE ENGINE | NULL | GPL |
| MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL | GPL |
| ndbcluster | ACTIVE | STORAGE ENGINE | NULL | GPL |
| FEDERATED_ODBC | ACTIVE | STORAGE ENGINE | libfederated_odbc_engine.so | GPL |
+----------------+--------+----------------+-----------------------------+---------+
12 rows in set (0.01 sec)

mysql> show create table t5;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t5 | CREATE TABLE `t5` (
`a` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`a`)
) ENGINE=FEDERATED_ODBC DEFAULT CHARSET=latin1 CONNECTION='obdc://Driver=myodbc3;Server=localhost;Database=federated_odbc;Port=5555;socket=/tmp/mysql-5555.sock;Option=3;UID=root:t2' |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select * from t5;
+---+
| a |
+---+
| 1 |
| 2 |
+---+
2 rows in set (0.00 sec)


Let's create a nice little postgres table:

patg=> create table t1 (a int);
CREATE TABLE
patg=> insert into t1 values (1);
INSERT 0 1
patg=> insert into t1 values (22);
INSERT 0 1
patg=> insert into t1 values (333);
INSERT 0 1
patg=> select * from t1;
a
-----
1
22
333
(3 rows)

patg=>

Let us now create a Federated ODBC table to this nice little postgres table:

mysql> create table t1pg (a int(11)) ENGINE=FEDERATED_ODBC CONNECTION='odbc://Driver=postgresql;Server=localhost;Database=patg;Port=5432;Option=3;UID=patg;PASSWD=patg:t1';
Query OK, 0 rows affected (0.20 sec)

mysql> select * from t1pg;
+------+
| a |
+------+
| 1 |
| 22 |
| 333 |
+------+
3 rows in set (0.02 sec)

Very nice indeed!

One of the biggest headaches has been realising that backtick "`" characters to "quote" column or table names do NOT work on postgres. Neither does 'show table status' in ::info. There are other syntax errors to deal with as well, and I have to play with postgres's logging verbosity to see what it means when it says there has been a syntax error.

Another question I have, one that I might've stumbled onto, is that the connection handle I create is a global class handle and not specific to the share. So if I connect to t5, and t5 happens to be a MySQL DSN (DSN/connection), and then I access t1, which _should_ be a postgres DSN (DSN/connection), I don't see that working all that well. So, what about each share having a connection? Also, how does one do functions like ::info? Not every database will give me information like 'show table status' gives me in MySQL.

A comment to a previous post indicated I would have to make sure all SQL is ANSI, and I think it's going to take a bit of thinking and work to get this to be as ANSI as possible but still have it work nicely.

MySQL AB to Offer Low-Cost, High Availability Solution for Business-Critical LAMP Applications

MySQL AB today announced a joint partnership and services agreement with LINBIT, the well-respected Austrian provider of high availability Linux systems technology called DRBD. Through its MySQL Enterprise subscription offering, MySQL AB will now offer direct support for this proven, low-cost solution for attaining “Four Nines” and greater uptime for transactional database applications in LAMP (Linux, Apache, MySQL, PHP/Perl/Python) computing environments.

DTrace and MySQL - 1 -- Frank Mash

With this post, I am starting another series of blog posts that will help you become familiar with DTrace. You can then apply that knowledge to find all the hidden performance goodies of MySQL on Solaris 10. Sounds good?

DTrace is one of those tools that the more you use it, the more you fall in love with it. To be fair, it is much more than a tool; in fact, it has its own language, D.

With DTrace you can enable probes by either their name or their number. To see a list of probes available, run:
[root@db31:/] dtrace -l | more
ID PROVIDER MODULE FUNCTION NAME
1 dtrace BEGIN
2 dtrace END
3 dtrace ERROR
4 syscall nosys entry
5 syscall nosys return
6 syscall rexit entry
7 syscall rexit return
8 syscall forkall entry
9 syscall forkall return
10 syscall read entry
11 syscall read return
12 syscall write entry
13 syscall write return
14 syscall open entry
15 syscall open return
16 syscall close entry
17 syscall close return
18 syscall wait entry
19 syscall wait return
20 syscall creat entry
21 syscall creat return
...
.

To enable a basic probe named BEGIN with the id of 1, we can either use:

[root@db31:/] dtrace -n BEGIN
dtrace: description 'BEGIN' matched 1 probe
CPU ID FUNCTION:NAME
1 1 :BEGIN
^C
or
[root@db31:/] dtrace -i 1
dtrace: description '1' matched 1 probe
CPU ID FUNCTION:NAME
0 1 :BEGIN
^C

In addition to enabling individual probes, we can also enable multiple probes simply by specifying them on the command line. For instance:
[root@db31:/] dtrace -i 1 -i 2 -i 3
dtrace: description '1' matched 1 probe
dtrace: description '2' matched 1 probe
dtrace: description '3' matched 1 probe
CPU ID FUNCTION:NAME
0 1 :BEGIN
^C
0 2 :END

or
[root@db31:/] dtrace -n BEGIN -n END -n ERROR
dtrace: description 'BEGIN' matched 1 probe
dtrace: description 'END' matched 1 probe
dtrace: description 'ERROR' matched 1 probe
CPU ID FUNCTION:NAME
1 1 :BEGIN
^C
0 2 :END

If you try to enable a probe that is not valid, you will get an error like this:

[root@db31:/] dtrace -n BEGIN -n END -n ERRORS
dtrace: invalid probe specifier ERRORS: probe description :::ERRORS does not match any probes

For complete and in-depth coverage of DTrace, make sure you check out the Solaris Dynamic Tracing Guide. For those looking to dive into examples right away, check out the /usr/demo/dtrace directory on your Solaris machine.

to be continued...

Joe Celko Giving MySQL Training? -- Mike Hillyer

Wow, this is really neat. Joe Celko, author of various SQL books including my beloved Trees and Hierarchies book, is going to provide virtual training courses on DB Design with MySQL.

There’s not too much detail at the link, but this would be an excellent course to attend, taught by a real SQL master.

It’s also a nice turnaround to see MySQL go from being dismissed by Celko for lack of standards compliance (search MySQL, see page 98) to using the same PL/PSM stored procedures as his books and having him train others on design with MySQL.

Time to save my nickels, this will be a course that shouldn’t be missed!

Barcamp Brussels 3 -- Kris Buytaert

The 3rd Barcamp Brussels is over. I was a bit disappointed by the target audience of the first Barcamp and missed the second one, but the 3rd one was right where I expected it to be: a healthy mix of technology, startup projects and the social aspect of technology.

I took my Tuxdroid with me and planned on having him tell the audience what they were twittering about. The idea was to take the logfile of my Twitter Jabber session and feed that to the text-to-speech daemon in Tux. I failed due to the lack of network, hence no incoming Twitter stream; however, the Tuxdroid proudly whacked its wings at the end of each talk :)

So about those talks :)

After Peter gave the starting shot for the fight for a slot to talk, I went to see the talk about gegis by Dirk Frigne. I totally didn't know there was an Open Source project written for the Belgian government which was into GIS .. cool stuff to see.

Frank was next with his talk about OpenID. I really should spend time with OpenID, hmm.. I kind of think I'm repeating myself :) Then Bernard continued with his talk on performance of web applications, which of course included a large part about databases.. I need to delete lots of stuff from my presentation :)

John introduced us to NotSoso. Now if only I could find some spare minutes to actually look into NotSoso; I guess that will be for later this week, or even after the weekend.

Then we had a 2 hour break for lunch and a group photo by Steven.

Maarten did a talk about the Blogoloog and how he started out with an old laptop with Fedora Core and has now evolved into something more suitable, and Peter explained Belgian tax law to us.

Then some people found me interesting enough to come and listen to my talk about MySQL Cluster. I had actually taken the talk I had been preparing for the MySQL Symposium that should have taken place in Amsterdam earlier this year and got cancelled. That talk was supposed to run over 2 hours, so I cut over 70% of the slides and tried fitting the rest into a 20 minute talk. I kind of "managed" :)

Bart gave an interesting talk about your online reputation, how to screw it up and how to protect it. An interesting topic came up: should you poison your blog to attract a different audience? Because according to him your peers are probably not your customers, and you should try to get new potential customers to know you. So with that in mind.. what should I start blogging about?

Clo and Bart Becks held a "Powerpoint Karaoke" on how to create a sect. Pietel managed to take pictures that suggested that everybody actually joined the sect; that certainly wasn't true! After It's in the details by Jesse and an even more interesting talk by Will about what we could do with Freebase, I headed home exhausted, but satisfied.

I'm sure I missed some other interesting talks, like the one from Toon Vanagt about TransferItOnline, but other events will come where I can meet and talk to those people again!

Building a Storage Engine: Writing Data -- Brian Aker

While there is quite a bit that can be done with read only engines, writing data is entirely more fun :)

For our next lesson we are going to be doing exactly that. Unless you are writing a blob store only engine you will need to deal with fields, aka the columns you declare with a create table. Placing these fields into some sort of format is required. Different engines implement different written forms to disk. Most transactional engines work in block formats, while stream designs are common for engines which need high write performance.

There is no one right way to implement storage on disk; every design has a trade-off.

Let us look at XML. It is slow to parse and slow to read. XML, though, is adored by many because it is a simple format that can be read by many applications.

For our example we are going to put together an XML engine, and to that end we will use the libxml2 library.

We will implement a very simple XML schema based upon the schema output by the mysqldump application.

Implementing INSERT for MySQL means implementing the write_row() method. The example engine will also implement the optional start_bulk_insert() and end_bulk_insert() methods. We won't worry about concurrent reads and writes just yet, so we will leave in place table level locks.

To store the XML, we will update the "share" with a filename that we will use for reading and writing. The share concept is found throughout almost all of MySQL's engines. The folks at Nitro Security have made available an OOP version of it which you can find on MySQL Forge, but for our example we will stick with the common C structure form.

The idea for the SHARE is simple: when multiple handlers need to share the same resource, they coordinate through a central piece of shared memory. A great number of engines support this through the get_share() and free_share() methods. These methods are not a part of the handler class, but instead are just common naming conventions. Each call to get_share() either inserts a new SHARE representing the shared memory into a hash keyed by the table name, or increments a use count integer in the existing SHARE.

For our needs we have extended the skeleton share with a character array, data_file_name, that holds the path of the file we will use.


typedef struct st_skeleton_share {
  char *table_name;
  char data_file_name[FN_REFLEN];
  uint table_name_length,use_count;
  pthread_mutex_t mutex;
  THR_LOCK lock;
} SKELETON_SHARE;


Then we have updated get_share() to store the path to the file in data_file_name. The mysys function fn_format() will make sure that the path is correctly set.



fn_format(share->data_file_name, table_name, "", ".XML",
MY_REPLACE_EXT|MY_UNPACK_FILENAME);



end_bulk_insert() only gives you a chance to clean up after a bulk insert, and we will use it to write out the XML file. start_bulk_insert() gives you an estimate of the number of rows that will be inserted; we will use the call to create the in-memory container for the XML that write_row() writes into.


void ha_skeleton::start_bulk_insert(ha_rows rows)
{
  DBUG_ENTER("ha_skeleton::start_bulk_insert");

  /* In-memory buffer that write_row() appends to. */
  xmlbuf= xmlBufferCreate();
  writer= xmlNewTextWriterMemory(xmlbuf, 0);

  /* Open the document and its root TABLE element. */
  xmlTextWriterStartDocument(writer, NULL, MY_ENCODING, NULL);
  xmlTextWriterStartElement(writer, BAD_CAST "TABLE");
  xmlTextWriterSetIndent(writer, 1);

  DBUG_VOID_RETURN;
}


The write_row() method passes in one parameter, which is the raw row in its in-memory (aka UNIREG) format. While some engines operate on this directly, it is considered best practice not to. We use the Field objects to write data into the XML file. Our XML engine supports nulls by placing empty FIELD tags into the XML file.


int ha_skeleton::write_row(byte * buf)
{
  DBUG_ENTER("ha_skeleton::write_row");

  /* Scratch String that each field is converted into. */
  char content_buffer[1024];
  String content(content_buffer, sizeof(content_buffer),
                 &my_charset_bin);
  content.length(0);

  xmlTextWriterStartElement(writer, BAD_CAST "ROW");
  xmlTextWriterSetIndent(writer, 2);

  for (Field **field=table->field ; *field ; field++)
  {
    if ((*field)->is_null())
    {
      /* NULL is stored as an empty FIELD element. */
      xmlTextWriterStartElement(writer, BAD_CAST "FIELD");
      xmlTextWriterEndElement(writer);
    }
    else
    {
      (*field)->val_str(&content);
      xmlTextWriterWriteElement(writer, BAD_CAST "FIELD",
                                BAD_CAST content.c_ptr_safe());
    }
  }
  /* Close the ROW element. */
  xmlTextWriterEndElement(writer);

  DBUG_RETURN(0);
}


Now finally we use end_bulk_insert() to actually write the XML file to disk:


int ha_skeleton::end_bulk_insert()
{
  File writer_fd;
  DBUG_ENTER("ha_skeleton::end_bulk_insert");

  /* Close the TABLE element and flush the writer into xmlbuf. */
  xmlTextWriterEndDocument(writer);
  xmlFreeTextWriter(writer);

  /* O_TRUNC so a shorter document does not leave stale bytes behind. */
  writer_fd= my_open(share->data_file_name, O_WRONLY|O_CREAT|O_TRUNC, MYF(0));
  my_write(writer_fd, (byte*)xmlbuf->content, xmlbuf->use, MYF(0));
  my_close(writer_fd, MYF(0));

  xmlBufferFree(xmlbuf);

  DBUG_RETURN(0);
}


Now we have written some XML to disk!
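If you want to play with the buffer-then-flush idea outside of the server, the sketch below mimics the three calls in Python with the standard xml.etree.ElementTree module instead of libxml2. The XmlBulkWriter class and the temporary file path are made up for this illustration and are not part of the skeleton engine.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

class XmlBulkWriter:
    """Toy stand-in for the handler: buffer rows in memory, write once."""

    def start_bulk_insert(self):
        # Plays the role of xmlBufferCreate() plus the opening TABLE element.
        self.table = ET.Element("TABLE")

    def write_row(self, values):
        row = ET.SubElement(self.table, "ROW")
        for value in values:
            field = ET.SubElement(row, "FIELD")
            if value is not None:          # None (NULL) leaves the FIELD empty
                field.text = str(value)

    def end_bulk_insert(self, path):
        # Writes the whole document in one go, replacing any existing file.
        ET.ElementTree(self.table).write(path, encoding="utf-8",
                                         xml_declaration=True)

# Usage: two rows, each with a NULL in the last column.
path = os.path.join(tempfile.mkdtemp(), "t1.XML")
writer = XmlBulkWriter()
writer.start_bulk_insert()
writer.write_row([1, "first", None])
writer.write_row([2, "a < b", None])
writer.end_bulk_insert(path)

rows = ET.parse(path).getroot().findall("ROW")
```

Note that in this sketch an empty string and a NULL both end up as an empty FIELD, so a reader cannot tell them apart.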

If you look at the chapter03 version of the skeleton engine you will also find that I have updated rnd_next() to now read the XML file.

http://hg.tangent.org/writing_engines_for_mysql

So what could be done to extend this? For one, it does not convert all of the data being passed into the XML file to UTF-8. Also, this interface assumes bulk insert, and it should be extended to append to the XML file rather than overwrite it. There is also very little protection against a corrupted XML file.

The previous entries in this series:
Getting The Skeleton to Compile
Reading Data

MySQL conference ending thoughts and presentation files -- Frank Mash

Man, I can't believe it's been over a week since I returned from the very great and exciting MySQL conference 2007. I got to meet all my old and new friends. Big Kudos to Jay Pipes and all MySQL'ers who helped make this event possible.

To me, this year's conference was the best ever. Partly because I made the very wise decision of staying at the Hyatt so I wouldn't miss much. At conferences like these the more you mingle with people, the more you get out of it. I had some amazing conversations with Mark Atwood, Brian Aker, Jay Pipes (I will never forget :)), Jeremy Cole, Eric Bergen, Pascal (Yahoo! France), Beat Vontobel, Markus Popp, Boel (MySQL HR), Christine Fortier, Marc Simony, Govi (Amazon), the "R"s of MySQL (Ronald and Ronald), Sheeri Kritzer, Carsten (certification), Ken Jacobs (Oracle), Kaj Arno, Dean (MySQL Support head), Domas, Kerry Ancheta (MySQL sales guru :)), Baron, Paul Tuckfield, Don MacAskill, Tobias, Peter Zaitsev, Chip Turner, Mark Callaghan and many more cool people. Thank you, everyone.

The sad part is that there wasn't enough time for me to hang out with people as much as I wanted. Oh well, MySQL Camp II is just around the corner in NYC.

I will also never forget the night I went to Denny's and had a Jalapeno burger with Michelle (wife), Jeremy, Adrienne (Mrs. Cole), Liam, Eric, Ronald, Domas and Pascal. It was so much fun.

At the conference, I also got a chance to be a part of the MySQL Certification Technical Advisory Board (thanks to Carsten and Roland for having me). There were some excellent ideas and important issues discussed there. Everyone, including Mike Kruckenberg, Colin Charles and Sheeri, contributed some excellent suggestions.

The presentation files for my sessions are now available at http://www.mysqlconf.com. Thank you to all those who have written to me repeatedly and kept reminding me about putting the slides online. I really appreciate your patience. The slides do not make up for the talk, so if you find yourself with a question, please feel free to shoot me an email. You can find my email address in the header of my blog.

Also, big thanks to Warren Habib, my boss, who was there to provide his support.

- MySQL and Lucene
- Fotolog: Scaling the world's largest photo blogging community (We have now crossed 100 million page views a day and are ranked as the 24th most visited site on the Internet by Alexa.)

For all those who have sent me an email, please bear with me as I will be sending a personal reply to everyone.

connector/odbc 3.51.15 -- Jim Winstead

this time it only took two months since the last release: mysql connector/odbc 3.51.15 is now available. there aren't a lot of bugs fixed in this release, compared to the 150 or so open bugs, but it is nice when you get to close a bug that is nearly three years old.

i'm not sure when the next release will happen, but i already have one patch pending.

How alter table locks tables and handles transactions -- Eric Bergen

I’ve talked to several people who have questions about how alter table works under the hood. They want to know how it handles locking tables, and why they can sometimes use a table during alter table and other times they can’t. Also why it’s so slow :)

First let’s look at the basic process alter table typically goes through.

  1. If a transaction is open on this thread, commit it.
  2. Acquire a read lock for the table.
  3. Make a temporary table with new structure
  4. Copy the old table to the temporary table row by row changing the structure of the rows on the fly.
  5. Rename the original table out of the way
  6. Rename the temporary table to the original table name.
  7. Drop the original table.
  8. Release the read lock.

The slowest part of the process is copying rows from the original table to the temporary table. For large tables this can take minutes to hours. There are a few optimizations built into this process. If the alter table query only renames the table then mysql doesn’t bother copying all the rows to a temporary table and just renames it. Most other changes, such as renaming columns, adding/dropping indexes, making columns nullable, or changing a column default, require copying the entire table.
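The copy-and-rename dance in the steps above can be tried out with a small Python sketch against SQLite. This is only an illustration of the sequence: the copy_alter() helper and the __tmp/__old table names are invented here, SQLite stands in for MySQL, and the table locks (steps 2 and 8) are not modelled at all.

```python
import sqlite3

def copy_alter(conn, table, new_columns_sql, select_sql):
    """Rebuild `table` with a new structure by copying it (steps 1, 3-7)."""
    tmp, old = f"{table}__tmp", f"{table}__old"
    conn.commit()                                              # 1. commit any open transaction
    cur = conn.cursor()
    cur.execute(f"CREATE TABLE {tmp} ({new_columns_sql})")     # 3. temporary table, new structure
    cur.execute(f"INSERT INTO {tmp} {select_sql}")             # 4. copy the rows across
    cur.execute(f"ALTER TABLE {table} RENAME TO {old}")        # 5. original out of the way
    cur.execute(f"ALTER TABLE {tmp} RENAME TO {table}")        # 6. temporary table takes the name
    cur.execute(f"DROP TABLE {old}")                           # 7. drop the original
    conn.commit()

# Usage: add a nullable "note" column to a one-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (t INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(10,), (20,), (30,)])
copy_alter(conn, "t", "t INTEGER, note TEXT", "SELECT t, NULL FROM t")
```

Step 4 is the single INSERT ... SELECT, which is where all the time goes on a large table.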

During the first 4 steps MySQL allows other clients to read from the table being altered. When alter table is done copying rows to the temporary table and is ready to rename it, it changes the table lock. MySQL instructs all other clients currently reading from the table to close the table when they are done. While alter table is waiting for existing clients to finish reading from the table it prevents other clients from starting to read from the table. During this time selects will be blocked on “Waiting for tables”. When the last client is done reading from the table alter table continues renaming the table.

This has some interesting implications for transactions and repeatable reads. Internally innodb keeps track of when rows are created. When a transaction is started it can only see rows that were created before the transaction was started (using repeatable read). Any rows created after are not returned. Since alter table copies rows from the old table to the new table, rows get a new version number as they are inserted into the temporary table. An alter table can cause a dirty read in transactions that span an alter table. Transactions started before alter table will get no rows back from the table after alter table is finished. If your application is sensitive to dirty reads or getting no rows back from a table (really dirty read :) ) then don’t run alter table on a server when clients are running.

Here is an example.

mysql a> alter ignore table t add unique index (t);

mysql b> begin;
Query OK, 0 rows affected (0.00 sec)

#This select is from the original table while alter table is copying rows
mysql> select * from t limit 10;
+------+
| t    |
+------+
|   10 |
|   10 |
|   10 |
|   10 |
|   10 |
|   10 |
|   10 |
|   10 |
|   10 |
|   10 |
+------+
10 rows in set (0.00 sec)
#alter table finishes
#Rows created in the temporary table before we issued begin
mysql> select * from t limit 10;
+----------+
| t        |
+----------+
|       10 |
|  6920631 |
| 27998430 |
| 41865298 |
| 49403894 |
+----------+
5 rows in set (0.00 sec)

#Get a new view of the table
mysql> commit;
Query OK, 0 rows affected (0.00 sec)

#This select returns all the rows.
mysql> select * from t limit 10;
+-----------+
| t         |
+-----------+
|        10 |
|   6920631 |
|  27998430 |
|  41865298 |
|  49403894 |
|  50522347 |
|  84441015 |
| 109401269 |
| 110202688 |
| 123590778 |
+-----------+
10 rows in set (0.00 sec)

If we start the transaction before alter table, the select after alter table has finished will return no rows even though there are rows in the table. I’m not sure why innodb allows rename of a table when transactions still have a view of that table open. It seems like a bug to me.
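The "no rows back" case is easy to reproduce in a toy model. The Python sketch below is not how innodb is implemented; ToyInnoDB, its clock and its snapshot rule are simplifications made up for this illustration. It only captures the two facts used above: a transaction sees rows created at or before its snapshot, and alter table re-creates every row.

```python
class ToyInnoDB:
    """Toy model: each row remembers the clock tick at which it was created."""

    def __init__(self):
        self.clock = 0
        self.tables = {}          # table name -> list of (value, created_at)

    def insert(self, table, value):
        self.clock += 1
        self.tables.setdefault(table, []).append((value, self.clock))

    def begin(self):
        # A transaction's snapshot is simply the current clock tick.
        return self.clock

    def select(self, table, snapshot):
        # Only rows created at or before the snapshot are visible.
        return [v for v, created_at in self.tables[table] if created_at <= snapshot]

    def alter_table(self, table):
        # Copy every row into a new table; each copy counts as a new insert.
        copied = []
        for value, _ in self.tables[table]:
            self.clock += 1
            copied.append((value, self.clock))
        self.tables[table] = copied   # rename over the original

db = ToyInnoDB()
for value in (10, 6920631, 27998430):
    db.insert("t", value)

snapshot = db.begin()                  # transaction starts before alter table
before = db.select("t", snapshot)      # sees all three rows
db.alter_table("t")
after = db.select("t", snapshot)       # same transaction, now sees nothing
fresh = db.select("t", db.begin())     # a new transaction sees everything
```

A transaction that begins partway through the copy would see only the rows copied before it started, which is the five-row result in the example above.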

Writing Supportable Software -- Mark Matthews

Yes, I know there are a lot of software development "top n" lists out there, and many dealing with writing "-[i/a]ble" software (extensible, maintainable, saleable, etc), but personally I wanted to get these concepts that have been rattling around in my brain for some time down somewhere and hopefully start a discussion around them. Even though this isn't a wiki, I'll probably end up treating this like a living document myself.

For a long time (in software years), I've been maintaining the code base for the JDBC driver for MySQL. I also work very closely with the support team on various connectivity-related issues (JDBC, ODBC, ADO.Net, P-this-or-that, etc.) that our customers have, and have had the (mis)fortune of debugging all kinds of problems in various stacks over the years.

Sometimes we (MySQL) make debugging more of a problem than it needs to be, and sometimes it's just inherent problems in the "stack". Given that I have a Java (or at least VM-based language) "bent", maybe not all of these concepts are workable at a technical level, but I'd like to at least see that on my team we adhere to as many of them as possible.

  • Keep as low a bug count as possible - this is common sense: keeping the bug count low (through quality design, maintenance, frequent releases, etc.) makes sure that your users aren't running into stupid issues that waste their time (and yours, in diagnosing the issues and working with the user on issues that really just shouldn't happen).

  • Grow empathy, not contempt for your users by being a user, rather than just acting like one (or relying on your QA team to act like one) - I'm a firm believer of eating one's own dogfood. I run MySQL "internally" (i.e. personally) for a lot of different things, and am always looking for more opportunities to use the software I work on for day-to-day tasks.

    If you use your own software for day-to-day things (especially if it helps you run the photo gallery that the family all looks at, helps with monitoring your network, and keeps the e-mail flowing), you'll run into similar issues that your users will, and end up creating an empathy for your users that is much better than the contempt for them that I have felt and seen from developers of various products.

    In my opinion, it's time to "throw in the towel" if you get to the point where as a developer you start to have contempt for your users and the issues they have with your software. If it weren't for the users, why would the software you work on exist in the first place?

  • Stay close to your users - blog (yes, I could be better about this!), answer questions on the forums, participate in IRC, etc. Make sure that you help out the "newbies" as well as the "old hands with tough problems", because they span the gamut of user types your software will have, and will bring more than one viewpoint to how easy to use (and thus supportable) your software is.

  • Stay close to the sales and support teams (if you're working in a commercial software house) - I think you can do no wrong by staying close to the sales and support teams. Both teams have the information you need that you might not always get from the community, and you'll get a lot of real-world, grounded feedback about what's working and what's not from issues that existing and prospective customers are having. You'll also hear about the extra work your software is causing (or preventing) by how supportable it is through your interactions with these teams.

  • "Never ask a customer to re-crash the car" - this is lifted from Elliot Murphy's blog, but I wholeheartedly believe in what he writes - don't make users run a "special" build of your software to diagnose problems if you can at all avoid it. Either provide built-in diagnostic tools or make it possible to figure out what's going on (at some level) by the error information you provide, which leads me to....

  • Make error messages meaningful - any time your software encounters an error, and interacts with the user to inform them of it, you have an opportunity to show, as a developer how much you actually value the user's time.

    In my opinion, an error message should give as much of the story to the user as that tenet you learned about in elementary school: "Who, What, Where, When (Why, and perhaps "how" to resolve the issue)".

    Too often we just write up a string that barely describes the issue in our own terminology, which then just sends the user (or the person supporting them) into some search for what is really happening when that error message is encountered.

    Luckily, the "who" is usually assumed, the "where" is covered by a stack trace (at least in VM-based languages), and the "when" is covered by the application log. That leaves coming up with a good "what", "why" and "how" to the developer.

    One error message I'm proud of in Connector/J is the one that you get when the connection to the database is lost, or can't be made in the first place. The driver knows the various kinds of exceptions that can happen at connect-time (connection refused, bind refused, etc.) and how long it's been since the driver has (ever) communicated with the server. Based on this information it presents different kinds of error messages, and even points to solutions. For example, if there are no client-side ephemeral ports to bind to, the user gets an error message like this:

    "The driver was unable to create a connection due to an inability to establish the client portion of a socket.

    This is usually caused by a limit on the number of sockets imposed by the operating system.

    This limit is usually configurable. For Unix-based platforms, see the manual page for the 'ulimit' command. Kernel or system configuration may also be required. For Windows-based platforms, see Microsoft Knowledge Base Article 196271 (Q196271)."

    If the driver thinks that your connection has been idle for longer than the value "wait_timeout" on the server, you get an error message like this:

    "The last communications with the server was nnnn seconds ago, which is longer than the server configured value of 'wait_timeout'.

    You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem."

    There are a few more types of messages that you get depending on the state of things, but the underlying theme is that the software scrapes up what it knows about the current situation, posits why the situation might be happening, and even tries to offer suggestions of what to do to prevent the error from happening.

    Not to say that everything's perfect in Connector/J, I do have this very useful error message still lurking about:

    "General error"

    The general work of collecting stupid error messages has already been done, but needless to say, I've got work to do fixing some of the few bonehead ones left in our wares.
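To show the "what, why, how" idea in miniature, here is a small Python sketch that builds a lost-connection message from what the client knows. It is not Connector/J's code; the describe_lost_connection() function, its parameters and the message wording are all invented for this illustration.

```python
def describe_lost_connection(idle_seconds, wait_timeout):
    """Build a three-part message: what happened, why, and how to fix it."""
    what = "The connection to the server was lost."
    if idle_seconds > wait_timeout:
        # The client knows how long it has been idle, so it can say why.
        why = (f"The last communication with the server was {idle_seconds} "
               f"seconds ago, which is longer than the server's "
               f"'wait_timeout' of {wait_timeout} seconds.")
        how = ("Test connections before use, expire idle connections, "
               "or raise the server timeout.")
    else:
        why = "The cause could not be determined from the idle time alone."
        how = "Check the server error log and the network path to the server."
    return "\n".join([what, why, how])

message = describe_lost_connection(idle_seconds=30000, wait_timeout=28800)
```

The point is less the wording than the habit: every branch that can detect a cause should also say what the user can do about it.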

Okay, so that's six points, I hope to grow this to at least ten, but I've got this blog entry off of my "todo" list for now...

State of the Computer Book Market, Q107, part 1 - Overall Market -- Tim O'Reilly

By Mike Hendrickson

Tim has asked me to take over writing the quarterly report on the state of the computer book market.

As described in the post Computer Book Sales as a Technology Trend Indicator, our research group has built a MySQL datamart containing Bookscan's weekly top 10,000 titles sold. Bookscan measures actual cash register sales in bookstores to you, the individuals purchasing and reading the books. Retailers such as Borders, Barnes & Noble, and Amazon make up the lion's share of these sales.

Book Market Performance

Here's the year-on-year trend for the entire computer book market since 2003, when we first obtained the data from Bookscan.

[Image: year-on-year trend lines for the computer book market, 2003 through the first quarter of 2007]



As you can see, the clear seasonal pattern we've pointed out before still exists. The trend line for each year closely mirrors the year before, with remarkably consistent weekly ups and downs. (The computer book market cratered in 2001, shrinking twenty percent a year for three years until it stabilized in 2004 at about half the size that it was in 2000. We only have data going back to 2003. If we had pre-dotcom bust data, it would all be, to borrow a saying from Al Gore, "off-the-charts.") 2005 saw a slight upturn from what we viewed as the bottom of the market in 2004. 2006 got off to an even stronger start, and it looked as though we were recovering from the post-dotcom-bust slump. But by the second quarter of last year, we'd reverted to the norm for the past three years.

In the first quarter of 2006, new interest in web development associated with Web 2.0 and strong performance of books on digital media applications like Photoshop helped to drive the market. In the first quarter of 2007, we hoped that the Microsoft Vista and Office 2007 releases would cause a similar sharp increase in our trend lines. That has not materialized, and in fact, you could say that Microsoft's new releases have not lived up to expectations yet, at least for book sales. I did say "yet" because there are signs that Vista is starting to pick up steam. But the fact is that without a significant bump from Vista and related Office products, the 2007 market has not performed at the 2006 level. I find it very interesting that the web and digital media had more of a market effect in 2006 than a huge, highly anticipated release of a new version of the world's most used consumer operating system and its office productivity suite has in 2007. It's one more sign of the waning of Microsoft's once fearsome market power.

Comparing Quarters

In order to see what is spiking and deflating the trend line above, we use our Treemap visualization tool. This tool helps us pick up on trends quickly, even when looking at thousands of books. It works like this.

The size of a square shows the market share and relative size of a category, while the color shows the rate of change. Red is down, and green is up, with the intensity of the color representing the magnitude of the change. A new category like Vista, which did not exist last year and hence can't be compared, shows up as black. The following screenshot of our treemap shows gains and losses by category, comparing the first quarter of 2007 with the first quarter of 2006.

[Image: treemap of gains and losses by category, 1st Quarter 2007 vs. 1st Quarter 2006]

So what are all the boxes and colors telling us? The fact that there are far more red boxes than green tells us that the market slump is broad-based, and that new categories like Vista and Office 2007 aren't enough to offset the overall market decline. If you compare this graphic with the one in Tim's post from the first quarter last year, it is quite noticeable that there are fewer green boxes and the green bright spots are smaller than last year.

It's probably not entirely fair to pick on Vista -- Windows book sales are up 21% over the year before, and Vista titles have taken over all the bestseller slots. But Office 2007 has definitely been a non-factor. While the Office category per se (books on the entire suite) shows 39% growth year over year, Excel and Word books are down 7%, with Powerpoint, Outlook, and Project books even further in the red, leaving the entire office applications category down 21%.

And of course, with both Mac OS X and the Adobe suites (now including Macromedia) down while they await new software releases, there's not a lot of support from the digital media end of the market.

As a result, we have to look elsewhere for signs of strength. In the business applications group, we see Sharepoint continuing its importance, and Quickbooks becoming a staple growth area. We also see some first signs of CRM heating up as a publishing area.

In the Web design and development area, it's worth noting that Ruby on Rails has continued its blazing growth, but Ajax books have not. The declines of both PHP and ASP are striking. Flash and Dreamweaver are also down, but as noted above, awaiting the CS3 release. Flex is starting to show its muscle.

Turning to the software development space, Agile is a category that is growing and one to watch. It wasn't too long ago that the dominant software development subjects were UML and Extreme Programming. Those barely show up any more although Extreme Programming was, loosely speaking, a precursor to Agile. Something worth looking at is the size of the boxes and those changes.

Remember, the size of the box represents the size of the market in relation to all items on the treemap. Rails in the bottom middle was a small speck in last year's first quarter post. Now the size of the box fits the size of its name at least. And its market share is almost equal to SQL and has surpassed VBA, Perl and Python. Python is also experiencing good growth, just not at the blazing velocity of Ruby. I would expect by next year, we will see Ruby have an even larger share/square.

On the Microsoft side, we see a large increase in the category of .Net programming. We really need to re-do this category, as .Net has been deprecated as a term, and this group includes several distinct technologies that need categories of their own. The topics driving the growth in this category include the MCTS certification, WPF, and WCF.

As noted above, Photoshop and related Adobe products are declining as everyone awaits CS3's release. The interesting thing with this market is that it pretty much is a one-for-one replacement type of thing. Most professional artists who make money by using these tools need to replace them to keep competitive. We'll watch this and report back in the coming quarters. I expect this to be a huge growth category by the end of 2007.

A useful way to organize the trends is to identify areas that are High Growth Categories [Bright Green], Moderate Growth Categories [Dark Green to Black], Categories to Watch [all colors], and Down Categories [Red to Bright Red]. Most of these descriptions are self-explanatory except perhaps the Categories to Watch. Categories to Watch contains titles that are typically not susceptible to seasonal swings. Subjects that fit into our Computer and Society segment are included here as well as other subjects that are not typically related to a release cycle. Categories to Watch also include stalwarts like Excel [even though it is susceptible to revisions] that sell, and sell and sell. Photoshop is also in this category as we are watching the effect of a looming new release and what is happening to the existing books in the category. To me, the watch list is a bit subjective and more of an intuitive view than a dogmatic view.

The table below highlights and explains some of the data from the chart above. The Share column shows the total market share of that category, and the ROC column shows the Rate of Change. So, for example, you can see that Windows Desktop OS books represent 9% of the entire computer book market, and were up 20%.

Category | Share | ROC | Notes

High Growth - Hot Topics
Windows Desktop | 9.0% | 20% | Vista Release; Vista expectations not yet met
Microsoft Office | 2.5% | 26% | Office 2007 Release; continued sales of previous versions still selling at 80% of Office 2007
Web Design | 1.6% | 22% | "Web Usability" titles are driving this.
Microsoft Certification | 1.3% | 16% | New MCTS certificate; interest in MS dev tools
Programming Languages | 1.1% | 18% | Ruby on Rails (RoR) + Python driving the category

Moderate Growth
Web Programming | 3.7% | 7% | Web 2.0, Ajax and RoR contributing to growth
SQL Server | 1.6% | 6% | SQL Server & services (reporting, analysis, integration) leading category

Categories to Watch
Photoshop | 6.3% | (21%) | Book market anticipating new version release
Computers & Society | 5.6% | (16%) | Not a "need to have" category. Should rebound with books like Myths of Innovation
Excel | 3.1% | (13%) | Down now but should increase because of Office 2007 release.

Down Categories - Not Hot Topics
Active Server Pages | 1.0% | (36%) | JSP and ASP both down -- RoR likely cause.
Security | 1.1% | (41%) | Most books geared towards sys admin security, nothing is really new in category.
Hardware | 1.3% | (42%) | Diverse topic, Make Magazine skews the numbers as more buyers now subscribe
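For readers who want the arithmetic behind the two columns, the Python sketch below shows one plausible way to compute them. It assumes Share is a category's sales divided by the whole market for the quarter and ROC is the change against the same quarter a year earlier; the function names and the unit counts are made up for illustration and are not Bookscan figures.

```python
def share_and_roc(sales_now, sales_year_ago, market_now):
    """Return (share of the market this quarter, year-over-year rate of change)."""
    share = sales_now / market_now
    roc = (sales_now - sales_year_ago) / sales_year_ago
    return share, roc

def format_roc(roc):
    """Format like the table: declines are shown in parentheses."""
    pct = round(abs(roc) * 100)
    return f"({pct}%)" if roc < 0 else f"{pct}%"

# Hypothetical numbers: a category selling 900 units in a 10,000 unit market,
# up from 750 units in the same quarter a year earlier.
share, roc = share_and_roc(sales_now=900, sales_year_ago=750, market_now=10000)
```

With these made-up inputs the category holds a 9% share and format_roc() prints its change as 20%, while a drop of 0.21 would print as (21%).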

Part two of this series will give a closer look at the technologies within the categories. Part three will be about the Publishers, winners and losers. And part four will be some more analysis of Programming Languages.


Can you trust your backups? -- Frank Mash

It is a good practice to check the integrity of your backups from time to time. You don't want something like this happening to you :)
time gunzip  -f -v entrieva/db4/guestbook_M-070507.tar.gz
entrieva/db4/guestbook_M-070507.tar.gz:
gunzip: entrieva/db4/guestbook_M-070507.tar.gz: unexpected end of file

real 67m56.782s
user 48m40.006s
sys 14m14.584s
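One way to catch this before you need the backup is to read every archive to the end right after it is written. Below is a minimal Python sketch of such a check using the standard gzip module; the gzip_is_intact() name is made up for this example, and it only verifies the gzip layer, not the tar contents or the data inside.

```python
import gzip
import zlib

def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip file can be decompressed without error."""
    try:
        with gzip.open(path, "rb") as f:
            # Read and discard the data one chunk at a time, so even a very
            # large backup never has to fit in memory.
            while f.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        # EOFError: the file was truncated. OSError also covers a bad header,
        # a failed checksum, or a missing file. zlib.error: corrupt data.
        return False
    return True
```

It still takes as long as a full decompression, but at least it happens on your schedule rather than during a restore.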

Solaris 10 Dual Boot Installation on a Laptop -- Frank Mash

Earlier, I blogged about how to install Solaris 10 on a MacBook using Parallels. This post contains links to screencasts about installing Open Solaris on a laptop as dual boot. The screencasts are about 30 minutes in total. Special thanks to Laurent Bridenne of Sun Microsystems.

For information on installing Solaris 10 as a virtual machine on Mac Book, see my earlier posts:

Who said penguins can't fly? -- Jorge Bernal

Reading “Linux failing to boot screen on the plane”, I remembered that I saw something similar on my flight to the MySQL Users Conference. It wasn’t so critical, but my entertainment screen got an unexpected reboot and I could see the tux logo and the console for like 2 seconds while booting. This was with Northwest Airlines.

So, who said Linux had problems with video? :P

Installing MySQL on Solaris 10 Virtual Machine: gcc and cc Compiler Option Differences -- Frank Mash

I left the last post in this series at the point of running make for the bitkeeper client. If you have been following the posts and tried to do that, you will be greeted with the following errors:
bash-3.00# make
cc -O2 -Wall -Wno-parentheses bkf.c -o bkf
cc: Warning: option -2 passed to ld
cc: illegal option -Wall
make: *** [bkf] Error 1

The first line shows the compiler options being used followed by a warning and an error. The reason we are getting this error is because cc options != gcc options. We have two solutions at hand at this point:

1. Use gcc (we installed it earlier)
2. Change the compiler options to use cc's compiler options instead of gcc.

Using gcc compiler
To use the gcc compiler instead of cc, do the following:
bash-3.00# CC=`which gcc`
bash-3.00# export CC
bash-3.00# make

This will let you get past the first set of errors. make will now stop with the following errors.
bash-3.00# make
/usr/local/bin/gcc -O2 -Wall -Wno-parentheses bkf.c -o bkf
Undefined first referenced
symbol in file
gethostbyname /var/tmp//ccGSplTt.o
socket /var/tmp//ccGSplTt.o
connect /var/tmp//ccGSplTt.o
ld: fatal: Symbol referencing errors. No output written to bkf
collect2: ld returned 1 exit status
make: *** [bkf] Error 1

These errors mean that you need to set LDFLAGS as follows
export LDFLAGS="-lsocket -lnsl -lm"

Now running make should produce no errors.
bash-3.00# make
/usr/local/bin/gcc -O2 -Wall -Wno-parentheses -lsocket -lnsl -lm bkf.c -o bkf

Using the cc compiler flags
To change the gcc compiler flags to cc compiler flags, edit the Makefile and replace the line that specifies the gcc options with a line using options recognized by cc. So, you would find the line:
CFLAGS=-O2 -Wall -Wno-parentheses

and replace with
CFLAGS=-xO2 -v

Regarding -Wno-parentheses, James Carlson of Sun Microsystems pointed out the following:
Gcc does have it documented in the 'info' files; you just have to look
under "-Wparenthesis" in the warning options.

For the equivalent outside of gcc, see lint's "-errchk=no%parenthesis"
option. (Our tool chain doesn't get lint and cc confused. ;-})

Now running make will produce
bash-3.00# make
cc -xO2 -v bkf.c -o bkf
"bkf.c", line 196: warning: Function has no return statement : clone
Undefined first referenced
symbol in file
gethostbyname bkf.o
socket bkf.o
connect bkf.o
ld: fatal: Symbol referencing errors. No output written to bkf
make: *** [bkf] Error 1

To get rid of the undefined symbols, you would need to link the right libraries.

Al Hopper of Logical Approach Inc once gave me a very handy tip for finding out more about a "mysterious" undefined symbol. He suggested using something like the following (run it for both /lib and /usr/lib):
for i in /lib/*.so
do
  /usr/ccs/bin/nm -Ag "$i" | grep -v UNDEF | grep gethostbyname
done


which produces:
/lib/libnsl.so: [2643]  |    110640|      89|FUNC |GLOB |0    |11     |gethostbyname
/lib/libnsl.so: [2683] | 110202| 222|FUNC |GLOB |0 |11 |gethostbyname_r
/lib/libnsl.so: [3113] | 128448| 174|FUNC |GLOB |0 |11 |_switch_gethostbyname_r
/lib/libnsl.so: [2841] | 110108| 44|FUNC |GLOB |0 |11 |_uncached_gethostbyname_r
/lib/libresolv.so: [1585] | 78996| 38|FUNC |GLOB |0 |11 |res_gethostbyname
/lib/libresolv.so: [1397] | 79034| 41|FUNC |GLOB |0 |11 |res_gethostbyname2
/lib/libxnet.so: [79] | 0| 0|FUNC |GLOB |0 |ABS |gethostbyname

This will tell us that we need the "-lnsl". Similarly running a slightly modified version of the above, we can find out that we also need "-lsocket" specified in LDFLAGS.
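If you would rather not eyeball the nm output, the lookup can be scripted. The Python sketch below parses output in the pipe-separated layout shown above and reports which libraries define an exact symbol name; libraries_defining() and linker_flag() are helper names made up for this example, and the parsing assumes that layout and a lib<name>.so file name.

```python
import os

def libraries_defining(symbol, nm_output):
    """Return the libraries whose nm line defines exactly `symbol`."""
    libs = []
    for line in nm_output.splitlines():
        if "|" not in line:
            continue
        columns = [c.strip() for c in line.split("|")]
        library = columns[0].split(":")[0]       # "/lib/libnsl.so: [2643]"
        section, name = columns[-2], columns[-1]
        if name == symbol and section != "UNDEF" and library not in libs:
            libs.append(library)
    return libs

def linker_flag(library_path):
    """Turn /lib/libnsl.so into -lnsl (assumes a lib<name>.so file name)."""
    name = os.path.basename(library_path)
    return "-l" + name[len("lib"):-len(".so")]

# A few of the lines from the output above.
sample = """\
/lib/libnsl.so: [2643] | 110640| 89|FUNC |GLOB |0 |11 |gethostbyname
/lib/libnsl.so: [2683] | 110202| 222|FUNC |GLOB |0 |11 |gethostbyname_r
/lib/libresolv.so: [1585] | 78996| 38|FUNC |GLOB |0 |11 |res_gethostbyname
/lib/libxnet.so: [79] | 0| 0|FUNC |GLOB |0 |ABS |gethostbyname
"""
found = libraries_defining("gethostbyname", sample)
```

On this sample it reports both libnsl.so and libxnet.so, so you still have to decide which one to link against, just as with the shell loop.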

The approach I mention above is a bit controversial so use it only if you know the issues it could drag you into. There are other ways to get the same information such as "man socket" and "man gethostbyname" that will do the job in most cases.

export LDFLAGS="-lsocket -lnsl -lm"

At this point, you should be able to run make without any issues.
bash-3.00# make
cc -lsocket -lnsl -lm bkf.o -o bkf


to be continued.

Migration of database with special characters -- Frank Mash

Michael Chu has posted a review of how to properly migrate a MySQL database that contains special characters.

The Perils of Perl, Do we need to take away your keyboard? -- Brian Aker

I love this stuff:

http://avatraxiom.livejournal.com/58084.html

... and frankly I am not going to be as eloquent as Chromatic:

http://use.perl.org/~chromatic/journal/33191?from=rss

Yes, PERL MADE ME WRITE BAD CODE.

The concept that a language makes you write bad code, or difficult code to maintain, is just bizarre.

Ok, if the programming language required you to only use i, l, and 1, as variables I might buy it... but for some reason, which I can clearly articulate over dinner if you are buying, I don't really see perl as the limiting reason.

Perl does not prevent you from setting up coding standards, writing good comments, or structuring the layout of ideas such that someone else can follow what you did. Nor does it require your program to dump core to stdout and print "redrum, redrum".

That would be a nifty easter egg though.

MySQL could use a few more...

And if I had any artistic skills whatsoever this post would have a 1950s-style horror cartoon with a big "Beware of Perl" title, but I don't, and when I went looking in Google Images for the pictures I got lost in a link feast of nifty comic art.

Links to material and documentation -- Ivan Zoratti

You can find the material following these links:
We will publish the webex for Part 2 asap.

Questions and Answers in the Second Session of the Online Solutions with MySQL Webinar - On Replication -- Ivan Zoratti

Correction on the INSERT DELAYED
On slide 20 I mentioned that the INSERT DELAYED statement can increase performance on the slave. This is wrong, since the DELAYED keyword is ignored by the SQL thread on the slave server. INSERT DELAYED can increase the overall performance of an application, since control is returned to the client as soon as the row is queued in the list of inserts to execute. INSERT DELAYED can be used with MyISAM, MEMORY and ARCHIVE.

Q from Filip: Do master and slave have to run the same DB version and the same operating system?
Not necessarily; you can have different versions and operating systems.

Q from Danilo: Is there a way to load a backup from the master without locking tables or shutting down the master database?
With InnoDB, you can use mysqldump with the --single-transaction option to produce a consistent backup without locking the tables.

Q from Henk: How are primary keys communicated back from the slave to the master?
There is no need: the primary keys generated on the master are pushed "as is" to the slave; they are not regenerated.
Q: How does a slave follow up an insert?
It receives the keys generated by the master, so if the slave implements foreign keys (although they would not be necessary), the data on the DB is consistent

Q from Martin: How can mysql prevent binary log corruption when link between master and slave breaks? This has happened to me, which stopped the replication thread indefinitely
The IO thread on the slave stops pumping the log. As soon as the master restarts, the slave resumes from where it stopped.

Q from Brendan: What is the latency between a transaction being committed on the master to the same transaction being committed on the slave in a replicated environment?
It depends on several factors - hardware, storage engine, workload etc. From a replication point of view, the transaction is first recorded on the relay log and then it's applied to the slave.

Q from Helen: Are all the DBs on the master server replicated? Can you selectively replicate databases?
You can selectively include or exclude DBs. This works down to the table level

Q from James: Are client-updates prohibited on a slave system when the master is active? If not then how do you prevent inconsistencies?
They are allowed. The slave can be used for any task; it is your responsibility to write only on the master. To make the slave read-only, you should work at the user-security level.

Q from Jonas: If I understand right, I could configure the master on both servers to address them with a virtual IP, is that correct?
Yes, that is correct, you can do it in order to simplify and speed up the failover

Q from Volker: The HA solution can only work if the binlog positions on Master and HA slave are the same, right?
No, the slave can be a little delayed. For synchronous HA we use other techniques, like IO replication (DRBD) or a shared disk architecture.

Q from Paul: How does master-master replication perform with cross continent latency? eg: UK & USA
MySQL Replication is asynchronous, so it will not be affected by latency. The use of separate threads prevents the geographical distance from affecting the activity on the slave DB.

Q from Danilo: What does "Non-deterministic writes" mean?
An example of it is an INSERT statement that contains the RAND() function
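The idea can be illustrated with a toy model in Python. It is not MySQL's actual binary log handling (which takes extra steps for some functions); the `Server` class and both replication helpers are hypothetical names made up for this sketch. It only shows why replaying the statement text can diverge, while shipping the computed value cannot.

```python
import random

class Server:
    """Toy server: a list of rows plus its own random source."""
    def __init__(self, seed):
        self.rows = []
        self.rng = random.Random(seed)

    def execute(self, statement):
        # A "statement" is a function evaluated on the server that runs it.
        row = statement(self)
        self.rows.append(row)
        return row

def insert_random(server):
    # Stands in for a write whose value is produced by a random function
    return server.rng.random()

def replicate_statement(master, slave, statement):
    master.execute(statement)
    slave.execute(statement)   # the slave re-evaluates the statement itself

def replicate_value(master, slave, statement):
    row = master.execute(statement)
    slave.rows.append(row)     # the slave stores the value the master computed

m1, s1 = Server(seed=1), Server(seed=2)
replicate_statement(m1, s1, insert_random)

m2, s2 = Server(seed=1), Server(seed=2)
replicate_value(m2, s2, insert_random)

print("statement replay in sync:", m1.rows == s1.rows)
print("value shipping in sync:  ", m2.rows == s2.rows)
```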

Q from Dave: Can replication handle 'load data infile' type operations?
Yes, the data loaded is stored in a temporary file in the tmp directory and passed to the slave through the IO thread. The SQL thread then loads the data from a file in the tmp directory of the slave server. The files are protected against access by other users.

Q from Owen: Could you use BLACKHOLE on Master 1 and split data afterwards to multiple partitioned DBs?
Yes, the BLACKHOLE storage engine is a good way to build a relay before the replication split.

Q from Andrew: If the update is async, is there a possibility that an update x made after update y in the server is performed y and then x on the slave?
If the updates are done in different transactions that commit in reverse order, it might happen. The locking mechanism of InnoDB and the default isolation level (repeatable read) make replication consistent.

Q from Alessandro: I've never heard about BLACKHOLE storage engine. What is it designed for?
It is a storage engine that stores nothing in the DB, but the binlog is generated anyway. It's used to build a relay server, reduce the use of resources and improve speed.

Q from John: Can you specify a master db by host name, or must it be an IP address?
You can use either the hostname or the IP address

Q from Clive: Is there any drawback to using dual-master replication (with auto-incr offsets) to allow automatic client failover with complete safety?
Bear in mind that replication is asynchronous. For complete safety, MySQL Cluster might be the solution, or some other HA technique such as IO replication or shared disk.
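For readers unfamiliar with the "auto-incr offsets" mentioned in the question: in MySQL they are set with the auto_increment_increment and auto_increment_offset server variables. The arithmetic behind them can be sketched as follows; the function name is hypothetical and the values are only an example for two masters.

```python
def auto_increment_ids(offset, increment, count):
    """First `count` ids a server hands out with the given offset and increment."""
    return [offset + i * increment for i in range(count)]

# Two masters sharing an increment of 2, with offsets 1 and 2
master_a = auto_increment_ids(offset=1, increment=2, count=5)  # odd ids
master_b = auto_increment_ids(offset=2, increment=2, count=5)  # even ids

print(master_a, master_b)
print("overlap:", set(master_a) & set(master_b))
```

The sequences never collide, which avoids duplicate keys on insert. It does nothing about the asynchronous delay described in the answer, so conflicting updates to the same row on both masters remain possible.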

Q from Rob: Do master and slave have to run the same DB version and the same operating system? You can always replicate to a newer version, right?
You can replicate to a newer version and/or different OSs.

Q from Rob: Is it possible to share query cache over multiple slaves?
No, the query cache is per server.

Q from Daniel: If I save binlogs in a NAS and net connection/write fails, will the server still work?
The master must be able to write to its current log and to generate the next one. Then, once logs are rotated, they can be pushed anywhere for safety. NAS is a solution.

Q from Carsten: Can 2 masters read from the same database files? (Implementing DFS for example) ?
This can be done using cluster-aware file systems, but there are some aspects to be carefully considered, such as the storage engine to use and the query locking.

Q from Phil: What do you think the impact of placing data and logs on different drives is?
This is an important point to optimize the IOs and it is recommended

Q from Manuel: I think multimaster replication writing on both to a table with an autoincrement column will cause heavy fragmentation of data and index. How can this affect query performance?
If load is fairly balanced between nodes, I do not see this as a problem. In other cases, it might be necessary to periodically optimize tables.

Q from Domenico: Is it possible to replicate from more than one master to a few slaves?
Currently, a slave can be linked only to one master. There are plans to change this and create multisource replication

Q from Phil: Is the use of high performance RAID important for the logs?
Absolutely, performance on the master will benefit from this.

Q from Alessandro: What is vertical partitioning?
It's a way to separate data by columns instead of rows. For example, if a large table has 10 columns, C1, C2 to C10, one can split this table into two tables: the first table will contain, for example, C1, C2 and C3, which are the most accessed columns for read and write. The second table will contain the primary key and the remaining 7 columns, which are rarely updated. This approach can improve performance significantly, since the tables are smaller and the columns fit more easily in the blocks used by the storage engine and in the caching mechanisms.
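A minimal sketch of that split, using Python's built-in sqlite3 module as a stand-in for MySQL so it can run anywhere. The table names `wide`, `wide_hot` and `wide_cold` are made up for the example, and c1 is assumed to be the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = [f"c{i}" for i in range(1, 11)]  # c1 .. c10, c1 is the primary key

# The original wide table with one sample row
conn.execute(
    "CREATE TABLE wide (c1 INTEGER PRIMARY KEY, "
    + ", ".join(f"{c} TEXT" for c in cols[1:]) + ")"
)
conn.execute(
    "INSERT INTO wide VALUES (" + ", ".join("?" * 10) + ")",
    [1] + [f"v{i}" for i in range(2, 11)],
)

# Vertical split: frequently used columns vs. key plus the remaining seven.
# (CREATE TABLE ... AS SELECT copies data only; keys and indexes would
# have to be declared separately in a real schema.)
hot, cold = cols[:3], [cols[0]] + cols[3:]
conn.execute(f"CREATE TABLE wide_hot AS SELECT {', '.join(hot)} FROM wide")
conn.execute(f"CREATE TABLE wide_cold AS SELECT {', '.join(cold)} FROM wide")

# Frequent queries touch only the narrow table ...
hot_row = conn.execute("SELECT c2, c3 FROM wide_hot WHERE c1 = 1").fetchone()

# ... and the full row is still available through a join on the key.
full_row = conn.execute(
    "SELECT h.c1, h.c2, h.c3, c.c4, c.c5, c.c6, c.c7, c.c8, c.c9, c.c10 "
    "FROM wide_hot h JOIN wide_cold c ON c.c1 = h.c1"
).fetchone()

print(hot_row)
print(full_row)
```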

Q from Ian: When is it a good idea to choose MySQL Cluster rather than using replication for HA?
Replication and Cluster are two completely different technologies. In short, Replication is the right choice for scale-out solutions when there are lots of read operations, such as on web search and read. MySQL Cluster is mainly used when the ratio of read and write operations is close to 1:1 and when queries affect small resultsets with direct key access or simple joins.

Q from Ap: What is the best method to load data into the slave, use mysqldump or use "load data from master"
LOAD DATA FROM MASTER is deprecated, the best solution is to use mysqldump

Q from Manuel: With master/slave replication you mentioned "automatic failover", how can you do that?
You can use clustering and HA software such as Linux HA to control the activity of the master. The software can switch over to the slave in case of issues. The client application will get access to the new master using a virtual IP address.

Q from Danilo: Is it possible connect master and slave with a serial link?
It's possible to connect master and slaves over any TCP/IP connection.

Q from Jon: For those people who aren't lucky enough to have access to MySQL Enterprise - are there any other methods of monitoring slave status (e.g. SNMP)?
Yes, some monitoring tools could help, such as Munin, Cacti and Ganglia. You can find information on these tools in the third webinar of the series (check the slides here).

Q from Jonas: If I have a 2-node cluster, with Heartbeat, each node has one IP (let's say IP1 and IP2, and share a VIP), can I configure both servers to be slaves, and the master be the VIP? Is not the active node both master and slave of himself?
Yes, in a sense, you can have a circular master replication and the master is accessible using the VIP

Q from Owen: What is the best configuration for HA in a MyISAM environment with high OTW but also high multi-joined selects for reporting?
IMO, the multi-joined selects are not affected by the HA solutions provided by MySQL. MyISAM can certainly help (provided that the system is generally safe from index corruption). If the requirement is to have a reporting server with multi-joined selects alongside a classic OLTP server, then replication is probably the best solution.

Q from Manuel: When you configured a slave, you didn't configure the log pos. Why?
In the example, we assumed we were starting from a fresh installation. The log position is important when the master and the slave must be synchronised while the system is running.

MMM Release 1.0-pre3 -- Alexey Kovyrin

MySQL Master-Master Replication Manager version 1.0-pre3 has been released today. Changes list is really short now:

  • Major fix in multiple clusters support - now you can use many clusters with one monitoring node (details are in mmmd_mon man page)
  • Man pages for mmmd_mon, mmmd_agent and mmm_control scripts
  • Startup scripts added/fixed for mmmd_mon and mmmd_agent
  • Installation script now requires iproute package to be installed on server.

As always, if you have any questions/suggestions, post them here or in mmm-devel mail list.


Why Couchsurfing should be open (update, again) -- Morgan Tocker

Note: I've edited this post and updated it with more information / fixed some grammatical errors.

Note 2: Sorry to the planetmysql crowd reading this. I can't seem to find a way to make it not show up.




Late last year, I started volunteering for couchsurfing.com. I knew they were suffering from database problems, and I knew that I could help.

I spent a week at the collective in New Zealand, and I helped add a few indexes here and there, but I've never really involved myself as much as I would like to, past the point of diagnosing what problems need to be fixed. I proposed a few potential solutions, but it bugs me at times that I'm not more involved.

This gets me to my point; there are some things I don't like about the project. I mostly keep quiet about them, because couchsurfing.com achieves more good than it does harm; negativity can be really bad. With Opencouchsurfing.org opening today, now seems like a good time to mention my issues - because I believe many of them are fixable. Consider this the "why I joined open couchsurfing movement" post.

Surf's up for the "Core Team"


This one might not be fixable. If you're a volunteer on couchsurfing, or you're linked to some key people on couchsurfing it instantly becomes easier to find a couch. For me, this goes against what couchsurfing is about since it potentially limits the experiences that hosts will be exposed to.

I think that:
* Any organisation structure should be flat and transparent.
* Any moderation system should be temporary moderation, and based on an arbitration system that has multiple people voting based on criteria. You don't have to implement a technical solution to a social problem, but I'd like to see us adopt a slashdot style voting system for disputes.

We keep user data, forever


I'm in two minds about this. Any communication between users is archived for all of eternity. We have a terms of service where we state that we do this, but at least in the EU we're skating on some pretty thin ice. Even pretending for a second that it is legal in the EU, I'm pretty sure we aren't legally allowed to read it (although Admins need to do this as part of disputes). Keep in mind that constitutional rights prevail over any conditions you put in a Terms of Service. In court, what we get our users to agree to might not stand up.

We do have a good reason to do it though; and that's to protect our members. If someone goes missing in a foreign country, then we want to hand over as much information as we can to the authorities; who they were staying with and where they might have been at certain dates.

I'm proposing that we have a "I've returned home safe, delete my data" button, or a maximum age to the information we store (3 months, 6 months, 12 months). I keep my own conversations off couchsurfing, and it's not something I'm proud to say that I have to do. I don't have access to the database server, I don't fully know how the backups work, and I can't trust that one day someone might do an under the table deal to sell my information.

I don't know if that will happen, but we've not insulated ourselves enough to stop it from happening. How many US government departments have had to admit they've accidentally leaked sensitive data?

Today I'm proposing a new idea for how admins work; I want them to have no access to read people's information. If personA leaves bad feedback on personB, then personB should fill out a dispute request. PersonA has 14 days to (a) reverse their bad feedback, or (b) both users disclose the information that they want to show as evidence to the arbitration team. It doesn't have to be an automated system.

I think there will be some instances where we think "we could really solve this dispute fairer/better if we could read their profile". In those instances we hand the data over to police. It's the cost of free speech, anything less and welcome to 1984, enjoy living in your police state.

We have a non-disclosure agreement. It has a non-compete clause.


Non-disclosure agreements (NDAs) are common in the software industry. I've signed a lot of them, but I won't sign one that has a clause that says that I cannot work for a competing business for a contracted length of time, or for all eternity. Some leading voices in the industry have protested about these clauses. The good news is that most of the time they are not even enforceable.

Given that couchsurfing's "competitors" are never listed in the contract, this is quite vague. If you were to say "travel business", or "accommodation business" then I would already be in violation of this clause because of some of the customers at MySQL (www.mysql.com/customers/) I provide advice to daily.

And excuse me, couchsurfing is a not-for-profit! People are unpaid volunteers. Traditionally volunteers have donated time in order to gain work experience, and now you're saying they can't work?

I would also like to think that all the couchsurfing style projects are ultimately aiming for the same goal of making the world a smaller place, we're not in competition. Ultimately, if hospitality club asked for database help, I would like to help them. If I sign the couchsurfing NDA, then I can't.

Note: This is something the key group is aware of. Progress has been quite slow though, since the first NDA protest I was aware of was around November. I hear that the new version is "really almost signed off on", but we're all yet to see a draft. (Update: One of the people behind the new NDA said there will still be a non-compete clause, but limited to 1 year. Still not good enough, since I thought this would be taken out.)

I actually never signed the original NDA; people thought I did and I never corrected them. A few days ago I publicly disclosed this when I was asked on a conference call. Now that that is known, I'm not sure what will happen. I might lose access to the developers' private mailing list (which, for the record, receives three times the traffic of the public list).

We won't release any code under an open source license


If you contribute any code to couchsurfing, couchsurfing owns it. That's the way the current non-disclosure contract works. I believe that couchsurfing needs to be given some rights when you donate your code, since otherwise you might withdraw and say "you can't have it any more". However, I don't believe that you should ever lose your own rights.

This means that as a volunteer if I invent some super efficient algorithm of storing the available couches, so that all searches are now 100 times faster, I can't implement that same algorithm on Hospitality Club. I also can't show anybody how I did it - since the NDA prohibits letting any of the code see the light of day. This really concerns me.

There are a lot of software projects that release their code under an "Open" (OSI approved) license. This means that when I download the software, I can also download the source code that was written to produce the software (Firefox, Linux, MySQL, Apache are all popular examples of products that use this style license).

When I was younger and trying to get a better job, I used to contribute to these open source projects so that I could show my skill in my portfolio, in much the same way that young lawyers work pro bono. I needed to do this because you can't tell exactly how good a programmer is by seeing the finished product; you need to look at the underlying code. When you see someone's code, you can tell how good they are. Did they correctly discover all opportunities for looping structures? Did they notice that by making a couple of small changes, 50% less memory would be used?

In software you can tell a good programmer from a bad one, and we're depriving our developers from adding couchsurfing to their portfolios.

The main argument I've heard for this is that we don't want to expose the software source because of "security concerns"; I don't support this argument. It's a method of security called security via obscurity, which is often critiqued as only being skin deep. Anyone with skill and time who feeds XYZ into a black box and gets back ZYX can work out that the letters are being reversed. The fact that you can't see the code that reverses them will only slow you down a little.

If we really were concerned about security, we would patch our servers, and not expose the database on a public IP address. Callum has criticized this himself here.

I would like to argue the opposite, in that open source increases security. They say that "many eyes makes problems shallow". Developers will produce better code when there's more scrutiny, and there will be more people to find the faults in the code before people start trying to attack the servers. There's also no way of telling if people are already attacking the servers and we don't know it.

There's not enough financial transparency


It costs real money to run couchsurfing. We have at least 10 servers running, and even with the cheapest hosting providers that's going to cost a couple of thousand per month.

People have publicly criticized this before, saying "where is all the money from donations going??". I don't think there is much left over after expenses, but as the site expands I'd like to set the precedent of disclosure now. If you look at Wikipedia, another not-for-profit, it is easy to find where their money comes from.

One point others have criticized, is that couchsurfing pays Casey's airfare to attend collectives. I actually support this, but I would have liked to see that others were eligible for partial travel assistance. My personal costs to go to New Zealand for 1 week, were about $1,000 Australian Dollars. Joe once posted to the mailing list the personal costs he'd incurred in taking a few months off to volunteer; It's not cheap.

Perhaps we could say "there's a budget of $2000 for travel assistance to this collective". Then each of us could apply for what ever amount of partial assistance we desire, and have a committee vote on who receives assistance; it's not too much different from how conferences organize speaker funding.

However in our organization this is quite different. The board consists of one member, so ultimately[1] it's Casey's decision (I cover this in my next point; there's no elections).

If there's enough money left over, I'd like to support the idea that someone in a key position like Casey can draw a small cost-of-living wage so that they will be able to buy a new laptop every now and then, and dedicate more time to the project. Wikipedia has a few paid positions, but my understanding is that these are paid below market rate.

There's no elections for positions


Not long ago, we needed to find a new tech team leader. The very next thing we got was a post to our developers list from Casey congratulating Chris on his appointment.

I like Chris; he knows his stuff and has previously been quite active. However, if we had elections and I got an opportunity to vote, I probably would have voted for Anu. My main reason for this is that in the last few months Chris' contributions have been minor.

There are about 40 people on the developers mailing list. Their contributions vary, but I believe they should have at least been given an opportunity to (a) stand as a candidate and (b) vote on a leader. Since the tech team impacts other teams, they should have been allowed to vote too.

Elections should span to Admins, ambassadors, and the board to stop what I described as my first problem (the core team surfs easier). Terms of appointment should be a fixed length. I think ultimately if there were elections for the board, Casey would be re-appointed; he's a good guy. So are a lot of the others.

It's not that complicated to run an election. We could use a model similar to wikipedia, or the condorcet vote used by the Debian group. Heck, I could even install memberdb for you, since I know how the code works.

Anyway, thanks for listening. Those are my thoughts, but feedback is welcome.

If you agree with any of my points I encourage you to do just two things:
Sign the petition.
Join the Open Organisation group

[1] There is a proposed new "leadership circle", but details on how this works are not yet available outside the circle.

Installing MySQL on Solaris 10 Virtual Machine: Other tools -- Frank Mash

Before we can install MySQL using bitkeeper, we need certain tools to be installed.

Download Bitkeeper Client

From Sunfreeware.com
- Install wget-1.10.2-sol10-x86-local
- Install gcc-3.4.6-sol10-x86-local or libgcc-3.4.6-sol10-x86-local.gz
- Install libiconv-1.11-sol10-x86-local (required for gcc)
- Install openssl-0.9.8e-sol10-x86-local (required for wget)
- Install make-3.81-sol10-x86-local (required for bk-client)

After downloading the above packages, navigate to the directory and then run
gunzip *.gz

Now you can install the packages one by one.
pkgadd -d libiconv-1.11-sol10-x86-local 
# this will install SMCliconv

pkgadd -d gcc-3.4.6-sol10-x86-local
# installs SMCgcc

pkgadd -d openssl-0.9.8e-sol10-x86-local
# installs SMCossl

pkgadd -d wget-1.10.2-sol10-x86-local
# installs SMCwget

pkgadd -d make-3.81-sol10-x86-local
# installs SMCmake

At this point, if running wget gives you "command not found", then make sure you have added /usr/local/bin to your PATH. The documentation will be installed in /usr/local/doc.

Now we can install bitkeeper client.
wget http://www.bitmover.com/bk-client2.0.shar
/bin/sh bk-client2.0.shar
cd bk-client2.0
make


to be continued...

VMWare 6 rocks -- Reggie Burnett

So I've been using Virtual PC 2007 since it was released but was always frustrated with its inferior Linux support.  I don't really understand why Microsoft drags its feet on this.  With more and more devs moving to VM-only development, it's getting easier and easier to change your host platform and still develop for Windows.  I'm sure Microsoft would much rather you run Windows as your host platform and run Linux in a VM when you need it.  I've tried VMWare Server but since I run Vista x64 as my host, that's more pain than I really like.

Enter VMWare 6.  Sure it installs and runs on Vista x64 like a champ and performance seems great but what really blew me away was the multi-monitor support.  I've been using twin 19" monitors for nearly 2 years now and I simply can't imagine life without them.  Folks I kid you not.  It was easier to get Ubuntu running on two monitors using VMWare 6 than it was running on my actual hardware.  I installed Feisty Fawn and then followed that up by installing the VMWare tools  (which installed without a hitch).  At the final stage of the VMWare tools install, it asks me what resolution I want to use.  I gave it a reasonable response, finished with VMWare tools, and then logged off so I could restart X.  Bam, the screen jumped to some huge resolution.  Something like 3000x2000.  I hit the maximize button on the VMWare screen and the Ubuntu desktop neatly filled both my monitors.  No editing of any x config files.  No Windows desktop elements visible.  No Windows taskbar.  Nothing.  Sweetness.

Multi-monitor mode is not supported with Windows Server 2003, but XP works like a champ.  VMWare 6 puts me a little bit closer to VM-only development.

MySQL Table Checksum 1.1.0 released -- Baron Schwartz (xaprb)

MySQL Table Checksum 1.1.0 adds many improvements, but the most important is a new way to ensure slaves have the same data as their master. Instead of checksumming the slave and the master separately, it can now insert the checksum results directly on the master via an INSERT ... SELECT statement. This statement will replicate to the slave, where a simple query can find tables that differ from the master. This makes a consistent, lock-free checksum trivially easy.
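The idea behind the replicated checksum can be modelled roughly in Python. This is a conceptual sketch only, not the tool's real schema, SQL or checksum algorithm: zlib.crc32 is a stand-in, and all function and table names here are invented for the example.

```python
import zlib

def table_checksum(rows):
    """CRC32 over the rows in a fixed order (zlib.crc32 is only a stand-in)."""
    crc = 0
    for row in sorted(rows):
        crc = zlib.crc32(repr(row).encode("utf-8"), crc)
    return crc

def run_checksum_statement(server_tables):
    # Models the replicated statement: each server evaluates it
    # against its own copy of the data.
    return {name: table_checksum(rows) for name, rows in server_tables.items()}

def differing_tables(master_result, slave_result):
    # Models the "simple query" run on the slave afterwards.
    return sorted(
        name for name in master_result
        if master_result[name] != slave_result.get(name)
    )

master = {"users": [(1, "ann"), (2, "bob")], "orders": [(10, 1), (11, 2)]}
slave = {"users": [(1, "ann"), (2, "bob")], "orders": [(10, 1), (11, 3)]}

drifted = differing_tables(
    run_checksum_statement(master), run_checksum_statement(slave)
)
print(drifted)
```

Because the same statement runs at the same point in the replication stream on both servers, the two results describe the same logical moment, which is what makes the comparison consistent without locks.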

There are also many other feature improvements and bug fixes, compatibility with MySQL 3.23.2 through 6.0-alpha, and finally I've gotten the documentation finished to my satisfaction.

Free Solaris 10 OS and Sun Developer Tools, or Solaris Express, Developer Edition DVD -- Frank Mash

Since I have been writing these days about working with MySQL on a Solaris 10 virtual machine on a MacBook, I thought, why not blog about a very cool offer by Sun? As part of their media kit program, you can get a free Solaris 10 DVD of your choice. The choices you have are:

1. Solaris 10 Operating System and Sun Developer Tools, or
2. Solaris Express, Developer Edition for x86 systems.

With one request, you can request as many as 9999 licenses. What a great way to market your product. For those who would be interested in installing Solaris 10 for your Mac, this is a good opportunity so you can have a copy of Solaris 10 on DVD.

This is a very cool and smart move by Sun. There are certainly lessons to be learned here for other open source companies like MySQL.

One thing I am wondering about is whether MySQL is included in the Solaris Express, Developer Edition. Probably not, but that goes to show a golden marketing opportunity for both companies. Companies that are already using Sun hardware and Solaris 10 will be much more inclined to become paying MySQL customers if MySQL and Sun work even more closely together to promote and optimize MySQL on Solaris. Solaris 10 is a great operating system and MySQL a great database. I would actually recommend that the management of both Sun and MySQL read the book "Getting to Yes: Negotiating Agreement without giving in" by Roger Fisher. It really is a great book that teaches you to focus on solving issues rather than taking positions. Taking either hard or soft positions in negotiations is never helpful.

Anyway, back to the free DVDs. According to the confirmation email from Sun, you can expect to get the DVD in 2-4 weeks. The email also included some helpful links about Solaris 10, which I have included below for reference.

Solaris Learning Centers:
http://sun.com/solaris/teachme

Solaris How to Guides:
http://sun.com/solaris/howtoguides

Solaris Training & Learning Services
http://sun.com/training/solaris

Sol