Web Analytics Made Easy -
StatCounter SELECTing WHEN chars >= 4000 - CodingForum


No announcement yet.

SELECTing WHEN chars >= 4000

  • Filter
  • Time
  • Show
Clear All
new posts

  • SELECTing WHEN chars >= 4000

    I have a table of articles, and each article has a "chars" field storing the number characters it has. There are two kinds of SELECTs I'm looking to do:

    1: Select until @var >= 4000. So MySQL would select a row, increment the variable by 'chars', and then check to see if it's >= 4000 before continuing.

    2: Select five rows that appear AFTER the first 4000 characters. So, five rows after whatever rows were selected in #1.

    Is this sort of thing doable in a query, without the use of PHP? If not, then oh well. The problem is that I'm doing the first query by selecting 10 rows, then using PHP to go through and get 4000 characters, then discard the rest. Then, and here's the bugger, I cache the output. Later on the page, I show five "most recent" articles using a variable set in the now-cached output to tell the query at which row to start pulling (WHERE id <= $last_id). When the first bit's been cached, the script doesn't run and the variable isn't then. Then the next bit on the page just errors out because it has nothing to go on.

    Well, any suggestions would be appreciated.

    Thanks in advance.
    offtone.com | offtonedesign.com

  • #2
    I'm not sure why you're storing the character count when you can use the CHAR_LENGTH() function to retrieve that value.

    How are you storing the text of the article? You were making it sound like you are storing each line of text from an article in its own row. If that's the case you'd probably be better off using the data type "text" and putting the entire article in it. You can easily select the first 4000 characters from an article using the SUBSTRING() function.
    SELECT SUBSTRING(article_text FROM 1 to 4000) as article_part,
    CHAR_LENGTH(article_text) as article_length
    FROM table


    • #3
      Ah, no. Definitely not storing one line per row. They are text fields. I'm storing the character count as a field because its value is strlen (strip_tags ($value)); I strip HTML out first, because it can account for more than a few characters sometimes. I suppose though that it wouldn't really matter, or since I'm caching the result anyway I could do a REGEXP_REPLACE inside the CHAR_LENGTH to remove tags before counting.

      Still though, how would the query know to break out once it's taken 4000 characters? The rows I select my look like this:

      | id | chars | @chars_so_far | 
      | 19 | 352   | 352           |
      | 18 | 634   | 986           |
      | 17 | 568   | 1554          |
      | 16 | 494   | 2048          |
      | 15 | 874   | 2922          |
      | 14 | 473   | 3395          |
      | 13 | 437   | 3832          |
      | 12 | 857   | 4689          | <-- Stop here
      | 11 | 203   | 4892          |
      | 10 | 657   | 5549          |
      In this example, I'd only really need to grab up to ID 12, as it's the first one past 4000. I still need whole articles returned, but I need no less than 4000 characters displayed.

      I'm going to head on over to the MySQL documentation site to see what can be done with variables.
      offtone.com | offtonedesign.com


      • #4
        Here's the would-be solution:

        SET @rows=0,@chars_so_far=0,@limit=100;
        SELECT @rows:[email protected] + 1, @chars_so_far:=chars + @chars_so_far, @limit=IF(@chars_so_far >= 4000, @rows, @limit) FROM content ORDER BY id DESC LIMIT @limit;
        I say "would-be" because there's a bug (or "missing feature") preventing the use of variables in the LIMIT clause. So I'm off to look for a Plan B.
        offtone.com | offtonedesign.com


        • #5
          Ahhh I completely misunderstood your first post, sorry

          I would actually not try to do what you're trying to do inside a query but rather use your server language (PHP or whatever) to fetch rows and tally the chars field using a PHP (or whatever) variable. Simple loop checks tally and bails when it reaches 4000.


          • #6
            That's what I do now, but I'm caching the output from the first and then the second fails once it's cached. It's cached as a file named by getting an md5 of the serialized object, so when the first one gets the 4000 characters, it passes the ID onto the next one which then queries WHERE id < $last_id LIMIT 5 to get 5 after the first 4000 characters worth of rows. Then it caches, too. But the second time around, when both are cached, the query for the second step is different ($last_id doesn't exist anymore) and thus the md5 () is different and it decides that there isn't a cached version of the output.

            I think I'll have to use a different caching method.
            offtone.com | offtonedesign.com