[GRLUG] A data organization problem related to tags. Any database experts around?

Michael Mol mikemol at gmail.com
Wed Apr 14 13:11:26 EDT 2010


On Wed, Apr 14, 2010 at 12:46 PM, Adam Tauno Williams
<awilliam at whitemice.org> wrote:
> On Wed, 2010-04-14 at 09:55 -0400, mikemol at gmail.com wrote:
>> I don't *modify* their quieries, or really even know what they look like.
>
> Understood, but changing the queries isn't necessarily the point of
> EXPLAINing a query.

Ah, gotcha.

>
>> >> 15-minute load averages of 0.11, and more IO waiting than I'm
>> >> comfortable; I need to get in and tweek InnoDB's settings again.
>> > Which as a DBA makes me think: (a) something is creating temporary sort
>> > tables [and MySQL ***SUCKS*** at sorting], (b) something is creating
>> > temporary join tables [a buffer space issue or really dumb join], or (c}
>> > [worst case] the engine is falling back to a sequential scan due to bad
>> > query design or incorrect indexing [often the issue of cardinality comes
>> > into play here]
>> > To determine any of those you have to look at the problem on the query
>> > level.
>> > Maybe current MySQL can autolog EXPLAIN of queries for you?
>> I can enable profiling, sure, but it's not going to tell me anything I can
>> or will change. The queries reside in a place not safe for me to touch.
>
> They will tell you what queries are slow; and then possibly if the
> indexing is wrong or ineffective - you can certainly change the indexing
> - or what inserts/updates are slow, or what joins are slow.  Things can
> be addressed involving all those things without changing the table
> schema.  You can change indexing (modern RDBMS have many indexing
> options), fragment data, and alter page and extent sizes.  All which can
> have a very significant impact on performance.  Dealing with performance
> issues in this manner, with existing applications [and thus *their*
> schema] is what a "DBA" does;  it is an entire profession.

Mm. I've had occasion and need to tune these things related to work,
but it seemed absurd to me that MediaWiki, which already powers a site
like Wikipedia, might need me make modifications to the tables that it
created itself. (About the only direct change I've made, in that
regard, was migrating engines. The original RC install, years ago, was
done using MyISAM, and it was later recommended to me that I switch to
InnoDB. Then I switched one of the tables back, because MyISAM was the
better option for that table. (MW supports InnoDB out of the box, but
it's an installation-time option.)

>
>> My complaint isn't with MySQL as it's performing under my load, it's
>> with MediaWiki for not being able to express the data to the end user
>> in the way I want,
>
> You originally framed the question as a data-model / performance
> relationships.  So discussing performance issues related to this
> data-model seemed relevant to me.

Performance issues relating to the pivot model were what I was
interested in, because I'm trying to consider what the best data back
end would be for a new system. MediaWiki's particular database
workload performance concerns were irrelevant to me, under the notion
that I wouldn't plan on using MediaWiki, going forward.

>
>>  and this whole issue of MySQL performance as it *currently* sits on
>> my site appears to have stemmed from a comment I made while trying to
>> show I do know *something* about how to improve the site's
>> performance.
>
> No, MySQL performance cannot be unlinked from the performance of an
> application that uses MySQL.

Yes, different workloads require tuning in different areas. There are
tuning parameters outside indeces, though, and that's where I've had
good success focusing my efforts for RC in the past. In particular,
tuning the InnoDB engine provided even more benefit than simply
throwing more RAM at the VM.

-- 
:wq


More information about the grlug mailing list