Viewing 9 posts - 1 through 9 (of 9 total)
  • #1471688

    Hi guys,

    I have discovered a huge problem with the way Avia queries run on WooCommerce sites.

    We noticed that some of our smaller, much lower-traffic sites were suddenly causing MySQL memory use to climb out of control: the longer the server runs, the more memory is used.

    We are talking about going from about 8GB to 80GB of usage, and still climbing as more and more bots get stuck crawling the queries.

    Bots are getting stuck following the queries in the dropdown like:
    ?avia_extended_shop_select=yes&product_order=relevance
    They are crawling every single query string on every single paginated page.

    This is causing excessive memory usage.

    Is there anything we can do to stop crawlers hitting these pages?
    I can see rel=nofollow on the links, but this doesn't seem to be stopping bots.

    In comparison, the default WooCommerce filters are not actual hyperlinks; they are a form submission, so bots don't crawl them.

    This is a serious problem.

    #1471720

    Following on from this

    I can 100% confirm this was the cause of our server issues.

    Essentially ALL Enfold websites with WooCommerce installed will gradually use up the RAM and see MySQL memory usage rise until the server crashes.

    The bots getting stuck crawling all of the ?avia_extended_shop_select=yes queries, even with rel=nofollow in place, include:
    Bing bot
    Google bot
    A majority of the major search engines
    AI crawlers: new, aggressive bots that ignore all instructions

    See the before and after screenshots. Our RAM use fell from almost 90GB to 8GB on average.
    Physical RAM usage screenshot
    https://www.dropbox.com/scl/fi/tw8p399gfdr9cclorc5j9/ramuse-screenshot.PNG?rlkey=d9ty4xvui4w3m2ycb9b9ez94s&st=28aameqn&dl=0
    MySQL memory usage screenshot
    https://www.dropbox.com/scl/fi/iixhdd5vpe7253tbavbxf/mysql-memory-usage.PNG?rlkey=15061shf0haiu3d36cdy0be3t&st=ep9fncx5&dl=0

    SOLUTION
    I propose that Enfold removes the custom sort-by options and reinstates the default WooCommerce ones.
    The defaults use form fields and JS, so there are no a href links in the default Woo sort-by dropdown. Bots cannot follow them because there are no URLs in the HTML.

    In your child theme's functions.php, add the following to remove the Enfold filters and reinstate the Woo ones.

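    // Remove the Enfold sort-by filters by overriding the theme function with a no-op.
    // Assumption: Enfold only declares this function if it does not already exist,
    // so the child theme's version (loaded first) takes its place.
    function avia_woocommerce_frontend_search_params()
    {
        return;
    }

    // Reinstate the WooCommerce default sort-by dropdown.
    add_action( 'woocommerce_before_shop_loop', 'woocommerce_catalog_ordering', 20 );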

    Add the following CSS to quick CSS and tweak depending on your sidebar position:

    .sort-param-count {
        display: none;
    }
    .product-sorting {
        padding-top: 0px;
    }
    div .product-sorting ul, div .product-sorting li {
        font-size: 16px;
    }
    div .product-sorting ul {
        width: 200px;
    }
    .main_color .sort-param a {
        color: #000000;
    }
    .sort-param-sort a, ul.sort-param-order {
        border: 1px;
        border-color: #969696;
        border-style: solid;
    }
    @media only screen and (max-width: 767px) {
        .responsive #top .woocommerce-ordering {
            position: relative;
            float: left;
            clear: both;
            margin: 0;
            padding-bottom: 25px;
            padding-top: 15px;
            top: 0px;
        }
    }
    @media only screen and (min-width: 768px) {
        .responsive #top .woocommerce-ordering {
            position: relative;
            float: left;
            clear: both;
            margin: 0;
            padding-bottom: 25px;
            padding-top: 15px;
            top: 0px;
        }
    }
    #top.woocommerce-page .woocommerce-ordering select {
        width: 100%;
        font-size: 16px;
    }

    Install the Redirection Plugin:

    Add the following regex rule to redirect the filter queries, so that the URL redirects BEFORE the query runs on the DB:

    Source URL: ^/(.*?)/\?avia_extended_shop_select=.*
    Enable: Ignore Case, Regex and Ignore Slash
    Target URL: /$1/
    Hit Save
    This will redirect any attempt to crawl the Enfold filters back to the current category.
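
As a rough check of what this rule catches, the sketch below runs the same pattern through Python's re module. It is only an approximation and not how the Redirection plugin works internally: re.IGNORECASE stands in for the Ignore Case option, the Ignore Slash option is not modelled, and redirect_target is a hypothetical helper name.

```python
import re

# Pattern copied from the post; re.IGNORECASE approximates the "Ignore Case" option.
# The plugin's "Ignore Slash" option is NOT modelled here (assumption: it only
# affects trailing-slash handling).
PATTERN = re.compile(r"^/(.*?)/\?avia_extended_shop_select=.*", re.IGNORECASE)


def redirect_target(request_uri):
    """Return the path the rule would redirect to, or None if it does not match.

    Hypothetical helper: mirrors the Target URL "/$1/" from the post.
    """
    match = PATTERN.match(request_uri)
    if match is None:
        return None
    return "/" + match.group(1) + "/"


if __name__ == "__main__":
    samples = [
        "/product-category/shoes/?avia_extended_shop_select=yes&product_order=relevance",
        "/product-category/shoes/page/2/?avia_extended_shop_select=yes",
        "/shop/?product_order=price&avia_extended_shop_select=yes",
    ]
    for uri in samples:
        print(uri, "->", redirect_target(uri))
```

In this approximation the rule only fires when avia_extended_shop_select is the first query parameter, so a URL that puts product_order first would slip through.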

    #1472046

    Hi,
    thinkjarvis, thank you for providing a solution to your issue, but I’m not able to reproduce this on my demo site, possibly because I don’t have bots attacking it.
    Nonetheless, a core change like this will require the Dev Team to review it and examine what backward compatibility issues it may cause. Please post to the Github Feature Request for the Dev Team to review this.

    Best regards,
    Mike

    #1472049

    I’ll be honest. I don’t know how you would recreate this without looking at an existing, well-established Enfold WooCommerce site.

    I can provide some screenshots of the Plesk logs as evidence, if you like.

    Our server is one of the most secure out there. We run three firewalls and have active malware protection. We even have bot blocking in place. But we cannot block Google and Bing, for obvious reasons.

    It appears that most AI bots, and even Google and Bing, are ignoring the nofollow added to the Enfold filters on WooCommerce sites.

    This is a really serious problem. You won’t be able to see this without a private server where you get full access to the logs.

    Please let me know if you want to see some examples. This hit every single Enfold site on our server and has been going on for several years. It has just reached a critical point because newer AI bots are joining in and crawling links they shouldn’t be.
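
For anyone who wants to check their own logs for this, here is a minimal sketch that tallies requests to the filter URLs by user agent. It assumes the access log is in the common "combined" format (which may not match every Plesk setup), and count_filter_hits is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed combined log format:
#   ip - - [date] "METHOD /path?query HTTP/x" status bytes "referer" "user-agent"
LINE = re.compile(r'"[A-Z]+ (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')


def count_filter_hits(lines, param="avia_extended_shop_select"):
    """Count requests whose URL contains `param`, grouped by user agent."""
    hits = Counter()
    for line in lines:
        match = LINE.search(line)
        if match and param in match.group(1):
            hits[match.group(2)] += 1
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python count_hits.py /path/to/access_log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for agent, count in count_filter_hits(log).most_common(20):
            print(count, agent)
```

A handful of user agents accounting for most of the hits on these URLs would be consistent with crawlers working through the sort links.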

    #1472051

    Hi,
    Thanks, I still advise posting to the Github Feature Request for the Dev Team to review this, as it would require a core change.
    Note that nofollow is up to the bots to obey; it is a recommendation, and there is no way to force it. Even robots.txt is only a recommendation; only a server-level regex rule can block these requests completely.
    Perhaps the Blackhole for Bad Bots plugin may help.
    But you seem to be asking for a core change, which would need to go through the Dev Team via the Github Feature Request.

    Best regards,
    Mike

    #1472055

    Hi Mike,

    Can you send me a link to the Github account so I can raise the concern?

    The fix has effectively increased the capacity of our dedicated server from about 300 websites to about 1000.

    The ramifications of this problem are huge.

    #1472061

    Hi,
    Yes, Github Feature Request
    Click on Issues, then New Issue.
    [Two screenshots showing the Issues tab and the New Issue button]

    Best regards,
    Mike

    #1474819

    Do we have any response to this yet?
    https://github.com/KriesiMedia/Enfold-Feature-Requests/issues/114

    Following my solution above to reinstate the default filters:

    I now use the following, more aggressive redirect in the Redirection plugin:
    Source URL: ^.*avia_extended_shop_select.*
    Enable: Ignore Case, Regex and Ignore Slash
    Target URL: https://www.domainname.com/shop/
    Hit Save
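
To illustrate why this rule is broader, here is a small sketch that runs the pattern through Python's re module. It only approximates the Redirection plugin's matching (re.IGNORECASE stands in for Ignore Case, Ignore Slash is not modelled), redirect_target_broad is a hypothetical helper name, and the target is the placeholder domain from the rule above.

```python
import re

# Pattern copied from the post; matches the parameter name anywhere in the URL.
BROAD_PATTERN = re.compile(r"^.*avia_extended_shop_select.*", re.IGNORECASE)

# Placeholder target taken from the post; replace with the real shop URL.
SHOP_URL = "https://www.domainname.com/shop/"


def redirect_target_broad(request_uri):
    """Return the fixed shop URL if the rule matches, otherwise None."""
    if BROAD_PATTERN.match(request_uri):
        return SHOP_URL
    return None


if __name__ == "__main__":
    samples = [
        "/shop/?avia_extended_shop_select=yes&product_order=relevance",
        "/shop/?product_order=price&avia_extended_shop_select=yes",
        "/?avia_extended_shop_select=yes",
        "/shop/?orderby=price",
    ]
    for uri in samples:
        print(uri, "->", redirect_target_broad(uri))
```

Because the parameter name can appear anywhere, this version also catches URLs where it is not the first query parameter, at the cost of sending every such request to one fixed page instead of back to its own category.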

    #1474836

    Hi,

    There will be a fix in the next release, 6.0.9.

    Best regards,
    Günter
