My SQL is a bit rusty and my brain just won’t work today, any SQL gurus out there want to lend a hand ?
I have a number of sections in the book store with a number of books in each section, the book_section table connects the books to the sections.
What I’m trying to achieve is a SQL script that will return all book section names and an indicator if any of the books in that section have a sale_price. I want all the sections listed, just with a flag that will alow me to determine if there are any books on sale in that section.
Table sections
section_id
name
Table books
book_id
name
price
sale_price
Table book_section
book_id
section_id
So I want to get a list like
Computers Y
Cookery N
Romance N
Business Y
In the books table I may have data like this;
1 SQL for Dummys $50.00 $0.00
2 Php Beginners $25.00 $15.00
3 Italian Cooking $10.00 $0.00
4 M & B hits $2.99 $0.00
5 Seven Habits $60.00 $10.99
I have to use MySQL 3.23, so correlated sub-queries are out, any ideas how I can get what I need using joins etc.. Also I inherited the database schema, and can’t change it very much.
If there is any way you can alter the schema, I’d add a flag to the book_section table:
alter table book_section add has_on_sale enum('Y','N') not null default 'N';
Anytime a sale price is add/removed from a book, update the section flag. IMHO, this will make your sql queries more efficient and application code easier to understand and maintain.
The criteria for a book on sale is that there would be a greater than zero value in the sale price. I wanted to avoid adding a flag so all that would happen, is that when the sale price gets set to a value greater than zero, the section would indicate that there was a sale on, makes it much easier for the person updating the system if all they do is add a sale price to a book and it magically appears on sale.
There’s less probability of user error if there’s only one step to perform, if I add a flag at the section level, when a book goes on sale, they have to update the flag, and of course when they take the book off sale, they have to remove the flag. Also if the book gets sold out, the “sale now on” could automatically disappear from the section if it was based purely on the value of the sale price.
The updating of the has_on_sale flag should be a function of the application, not the end user. ie, anytime a book gets added/updated, test if that section has any books on sale and update the section has_on_sale flag as well.
If you wanted a separate on_sale flag on each book as criteria, the same would hold true. The end user just enters a sale price, and the flags are updated accordingly by the application.
SELECT book_section.section_id, sections.name SUM( books.sale_price >0 ) AS On_Sale, FROM book_section, sections, books WHERE book_section.section_id = sections.section_id AND book_section.book_id = books.book_id GROUP BY book_section.section_id
And then use On_Sale > 0 to flag Yes. Or something like that.
I agree that adding a flag would be handy but not good database design. If you want a normalized database, you shouldn’t have a field that can be derived from another field.
I am interested to know how to properly solve your problem. I have an idea that is less than elegant and untested, but I think it should work.
First get section names having books on sale. Then to get section names with books not on sale. The queries are almost identical - see below:
$arr = array(); $query = $this->db->query('select distinct section.name from sections, book_section, books where sections.section_id = book_sections.section_id and books.book_id = book_section.book_id and books.sale_price > 0'); foreach ($query->result() as $row) { $arr[$row->name] = 'Y'; }
$query = $this->db->query('select distinct section.name from sections, book_section, books where sections.section_id = book_sections.section_id and books.book_id = book_section.book_id and books.sale_price = 0'); foreach ($query->result() as $row) { $arr[$row->name] = 'N'; }
If you want the results ordered alphabetically you use the ksort php function:
ksort($arr);
EDIT
J. Decker has a better solution I think - well done.
mohrt; Yes agree the updates should be part of the app, not user dependant at all, just trying to simplify the app at the same time as making it simple for the user. The fewer lines of code I can write to achieve the app, the fewer possibilities for bugs
I was pretty sure that I was missing something and that I should be able to do 1 SQL select and get the book title and an indication of the on-sale status.
I could do it with a temporary table, built from a join of the book_section and books table where sale_price > 0 grouped by section_id and then left join this to the sections table to give me a list of all the sections with either a sale_price > 0 or NULL
However, I don’t like the idea of using a temporary table !
JD; nice bit of lateral thinking, sort of what I was looking for, only thing I can’t work out, is why with the temp table solution, I get 40 rows returned (correct amount) but with your solution, I get 36 ??? Can’t see anything wrong with your logic, so I’m still missing something.
I agree that adding a flag would be handy but not good database design. If you want a normalized database, you shouldn’t have a field that can be derived from another field.
You know, I used to think this way. Normalize everything. But I have since changed my mind. What it comes down to is a simple trade-off… data normalization vs. query speed and application code maintenance. In my case, select speeds are crucial.
I’ll take an example from Motortopia. We have members with cars, photos, videos, etc. When I select a member from the database, I need to know the number of cars, photos and videos they have.
I could construct a lengthy query containing joins and/or sub-selects to get these values on-the-fly, but this is process-intensive for the db server, not to mention repetitive (ie. everytime I select member A, the same numbers are ground out of the other tables.) Proper indexing and server-side query caching help, but only get you so far.
So instead of running these complex queries every time I select a member, I keep columns in the member table that track the number of cars, photos and videos. Then the query becomes a simple select from the member table.
Now the “drawback” of this design, I have to update these tracking numbers anytime cars, photos and videos get added or deleted. (BTW, this is a perfect spot to used stored procedures or triggers in the db!) With this little bit of work upfront, I have saved myself a LOT of processing time selecting records from the database. When your site traffic grows to very large numbers, you will be thanking yourself that you got rid of those joins and sub-selects.
I still use joins where it makes better sense, but I think I got rid of every sub-select. Anytime I find the need for a sub-select, it is usually better taken care of with a column in another table to keep the select queries speedy.
I have a similar experience, I used to normalise everything down to 3rd or 4th normal form, but with experience you do tend to change things, I still normalise, but it’s more like down to 3rd NF and then back it off a bit for exactly the reasons that you mention.
A short time ago, I was exposed to a CMS that used one table for each function, i.e. lots of redundant data, but it did open my eyes to the fact that coding applications was sometimes far simpler with just the one table to deal with. My mantras are KISS, code it once, code it correctly, so I try to simplify everything that I do and try to think about the best ways to ensure that the user is not put in a position where they create problems that the programmer then has to sort out.
One of the reasons that I like CI so much, is that you can accomplish a great deal with relatively little, readable, understandable code (same goes for the documentation of CI).
I also code in C# (trying to get rid of it) and I think it’s a pretty good example of how to overcomplicate things, it takes ages to write anything in C# (unless you just do the VB type thing in Visual Studio).
When selecting CI, I also looked at Cake, Symphony and Qcodo, but the documentation, relative simplicity and clarity of CI won me over rapidly. I had a big project to write in a hurry so CI really made sense, as a self employed coder, getting the job done, swiftly, accurately and robustly is the top priority, and CI fits the bill (Thanks Rick).
Oh and one job I had to integrate two different databases, over which I had no control, this is where the flexibility of CI is so critical, using Qcodo pre-supposes that you start with your own rational, well designed database, using an MVC framework that starts by building from the database sounds great, but you really need to have total control over the database design, otherwise it becomes messy very quickly.
mohrt, you seem to bend the normalization rules intelligently. Your example is practical and I like it. I’m not sure I agree that code maintenance is always improved when using redundant database fields:
data normalization vs. query speed and application code maintenance
Sure, queries can be simpler and faster. But to maintain your app, you (or anyone else) would have to be aware of every redundant field in your database. For example, you would have to be careful when using a database util like phpmyadmin or scaffolding to add/remove cars, photos, and videos.
What I mean by code maintenance is the complexity of the queries. A select from one table is much easier to read and understand than maintaining a bunch of joins and sub-selects.
As for maintaining the “redundant” fields, You can handle them with stored procedures (or triggers) in the db. With those fields fully automated, it simplifies the application and tightens the integrity of the data, similar to how it was with full normalization. This should also make phpmyadmin and scaffolding work again without worry
[edit] I realize using db-side features diminished the portability of the code, so this is very application specific. In my case the code is for my own use, and query select speed is the utmost of importance, so all these little tricks help!
Rick Jolly had the right idea, just need to take it one step farther
select distinct section.name, 'Y' from sections, book_section, books where sections.section_id = book_sections.section_id and books.book_id = book_section.book_id and books.sale_price > 0 UNION select distinct section.name, 'N' from sections, book_section, books where sections.section_id = book_sections.section_id and books.book_id = book_section.book_id and books.sale_price = 0 order by section.name
That’s what I was looking for, it’s all coming back to me now… one small problem, the version of MySQL that I have to work with is 3.23 so no UNION ... I could use a merge table, but then again it’s probably better for me to get the hosting changed to MySQL 4.1 or better anyway.
Thanks guys
Code Igniter forums, another reason CI is my weapon of choice !
SELECT s.section_id AS `section_id`, s.name AS `section_name`, (CASE WHEN SUM(b.sale_price) > 0 THEN 'Y' ELSE 'N' END) AS `section_has_book` FROM sections s LEFT JOIN book_section bs ON s.section_id = bs.section_id LEFT JOIN book_section b ON bs.book_id = b.book_id GROUP BY s.section_id, s.name
which gives only one active record query :
$this->db->select('s.section_id as `section_id`, s.name as `section_name`, (CASE WHEN SUM(b.sale_price) > 0 THEN \'Y\' ELSE \'N\' END) as `section_has_book`'); $this->db->from('section s'); $this->db->join('book_section bs', 's.section_id = bs.section_id', 'LEFT'); $this->db->join('books b', 'b.book_id = bs.book_id', 'LEFT'); $this->db->groupby('1, 2');
Not tested, but the logic is there.
Maybe for MySQL 3 you just should get rid of the ‘join’ clause and use more ‘from’ and ‘where’ clauses. The ‘case’ clause is well supported on MySQL > 3.23.3, so don’t be afraid using it.
Edit : sorry for answering so late, I didn’t realize this thread was more than two months old. I just saw that there was a problem, a recent answer and nothing else.