Performance Issue in 64-bit Builds Compared to 32-bit (Filter Decision Suspected) #274

@yesilcimenahmet

Description:
There is a significant performance difference between 32-bit and 64-bit builds when using libspng. Analysis and discussion in the zlib-ng repository showed that the 64-bit build encodes considerably slower than the 32-bit build.

Observed Behavior:
In our tests, encoding a PNG takes ~131 ms with the 32-bit build, while the same image takes ~400 ms with the 64-bit build.
Both tests were conducted on the same machine with the same configuration, using libspng and zlib-ng.

Detailed Analysis:
Filter Decision as the Cause:

Based on profiling, it appears the 64-bit build takes significantly longer because of differences in the filter decision logic.
The 32-bit build uses a more optimized path (e.g., SIMD vectorized loops), while the 64-bit build appears to rely on scalar operations.
Forcing the filter choice to SPNG_FILTER_CHOICE_NONE resolves the performance issue and brings the 64-bit performance in line with the 32-bit results. However, this is a manual workaround, and disabling filtering will usually produce larger files.
Filter Logic Differences:

libspng dynamically selects filters during the encoding process.
It seems the heuristic for choosing filters differs between 32-bit and 64-bit builds, potentially due to underlying differences in how zlib-ng operates in these environments.
zlib-ng Findings:

The analysis in the zlib-ng repository revealed that the 64-bit build might have suboptimal behavior in encode_scanline.
Scalar operations and loops dominate the profiling data in the 64-bit build, while the 32-bit build uses SIMD vectorized loops effectively.
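
To make the per-scanline cost of filter selection concrete, here is a minimal sketch of the "minimum sum of absolute differences" heuristic that the PNG specification suggests for adaptive filtering. It is an illustration only and an assumption about the general approach, not libspng's actual code: the names Filter, ApplyFilter, Cost and ChooseFilter are hypothetical, and the Average and Paeth filters are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// PNG filter types 0..2 (Average and Paeth omitted to keep the sketch short).
enum class Filter : std::uint8_t { None = 0, Sub = 1, Up = 2 };

// Apply one filter to a scanline. `prev` is the previous unfiltered scanline
// (all zeros for the first row); `bpp` is bytes per pixel (4 for 8-bit RGBA).
std::vector<std::uint8_t> ApplyFilter(Filter f,
                                      const std::vector<std::uint8_t>& cur,
                                      const std::vector<std::uint8_t>& prev,
                                      std::size_t bpp)
{
    assert(prev.size() == cur.size());
    std::vector<std::uint8_t> out(cur.size());
    for (std::size_t i = 0; i < cur.size(); ++i) {
        const std::uint8_t left = (i >= bpp) ? cur[i - bpp] : 0;
        switch (f) {
            case Filter::None: out[i] = cur[i]; break;
            case Filter::Sub:  out[i] = static_cast<std::uint8_t>(cur[i] - left); break;
            case Filter::Up:   out[i] = static_cast<std::uint8_t>(cur[i] - prev[i]); break;
        }
    }
    return out;
}

// Cost of a filtered scanline: sum of absolute values, reading each byte as a
// signed 8-bit quantity (so 255 counts as 1, 128 as 128).
std::uint64_t Cost(const std::vector<std::uint8_t>& filtered)
{
    std::uint64_t sum = 0;
    for (std::uint8_t b : filtered)
        sum += (b < 128) ? b : (256u - b);
    return sum;
}

// Try every candidate filter and keep the cheapest. Each candidate is a full
// pass over the scanline, which is why this step is sensitive to whether the
// inner loops get vectorized.
Filter ChooseFilter(const std::vector<std::uint8_t>& cur,
                    const std::vector<std::uint8_t>& prev,
                    std::size_t bpp)
{
    Filter best = Filter::None;
    std::uint64_t bestCost = Cost(ApplyFilter(Filter::None, cur, prev, bpp));
    for (Filter f : {Filter::Sub, Filter::Up}) {
        const std::uint64_t cost = Cost(ApplyFilter(f, cur, prev, bpp));
        if (cost < bestCost) {
            bestCost = cost;
            best = f;
        }
    }
    return best;
}
```

With all five PNG filters enabled, a heuristic of this shape makes several passes over every scanline before any data reaches the compressor, which would be consistent with the encode time dropping once filtering is disabled.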

Steps to Reproduce:
Use the provided C++ example to encode a raw image into a PNG with libspng.
Compare the encoding times between 32-bit and 64-bit builds.
Optionally, set the filter choice manually to SPNG_FILTER_CHOICE_NONE to observe how it impacts the 64-bit performance.

resultCode = spng_set_option(ctx, SPNG_FILTER_CHOICE, SPNG_FILTER_CHOICE_NONE);
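
A single timed run can be skewed by file caching or CPU frequency scaling, so it may help to repeat the encode several times and compare medians between the two builds. The helper below is a small sketch for that purpose; MedianMillis is a hypothetical name and not part of libspng.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Run `work` `runs` times and return the median wall-clock time in
// milliseconds. For an even number of runs the upper median is returned.
double MedianMillis(const std::function<void()>& work, int runs)
{
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

In the full example this could wrap the encode call, for instance `MedianMillis([&]{ EncodeRawImageToPNG(raw_fname, out_fname, w, h, 300); }, 9)`, once with the default filter choice and once with SPNG_FILTER_CHOICE_NONE.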

Expected Behavior:
Both 32-bit and 64-bit builds should perform similarly, with comparable encoding times and efficient use of filters.

Links to Related Issues:
zlib-ng Performance Analysis

Request:
Could you investigate the filter decision logic in libspng? Specifically:

Why the 64-bit build seems to perform worse in selecting filters.
Whether this is related to differences in how zlib-ng interacts with libspng in 32-bit vs. 64-bit environments.
How the default filter heuristic could be improved for 64-bit builds to align with 32-bit behavior.
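
When comparing the two builds, it may also be worth recording which SIMD instruction sets each one was compiled for. The sketch below reports the pointer width and a few common compiler-defined macros; PointerBits and DescribeBuild are hypothetical helper names. It only reflects the flags of the translation unit it is compiled in, so it needs to be built with the same flags as libspng and zlib-ng, and it says nothing about runtime CPU dispatch inside those libraries.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Pointer width of this build: 32 or 64 on common targets.
constexpr std::size_t PointerBits() { return sizeof(void*) * CHAR_BIT; }

// Summarize the build target using compiler-defined macros
// (GCC/Clang style, plus the MSVC equivalents for SSE2).
std::string DescribeBuild()
{
    std::string s = std::to_string(PointerBits()) + "-bit;";
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    s += " SSE2";
#endif
#if defined(__AVX2__)
    s += " AVX2";
#endif
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    s += " NEON";
#endif
    return s;
}
```

Printing `DescribeBuild()` from both the 32-bit and the 64-bit binary gives a quick check on whether the two builds actually target the same instruction sets.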

Full Example:

#include <iostream>
#include <fstream>
#include <vector>
#include <stdexcept>
#include <cmath>
#include <cstdint>  // uint32_t
#include <cstdio>   // printf
#include <string>
#include <cstring>
#include <chrono>

extern "C" {
#include "/home/adam/spng-install/include/spng.h"
}

int WritePNGCallback(spng_ctx *ctx, void *user, void *src, size_t length)
{
    std::ofstream* out = reinterpret_cast<std::ofstream*>(user);
    if(!out->write(reinterpret_cast<const char*>(src), length))
    {
        return SPNG_IO_ERROR;
    }
    return SPNG_OK;
}

void EncodeRawImageToPNG(const std::string& RawFileName,
                         const std::string& PngFileName,
                         uint32_t Width,
                         uint32_t Height,
                         int DPI)
{
    spng_ctx* ctx = nullptr;
    spng_ihdr ihdr;

    std::ifstream rawFile(RawFileName, std::ios::binary);
    if(!rawFile.is_open()) throw std::runtime_error("Failed to open raw file.");
    rawFile.seekg(0, std::ios::end);
    std::streampos fileSize = rawFile.tellg();
    rawFile.seekg(0, std::ios::beg);

    std::vector<unsigned char> rawBuffer(fileSize);
    if(!rawFile.read(reinterpret_cast<char*>(rawBuffer.data()), fileSize))
        throw std::runtime_error("Failed to read raw file into memory.");
    rawFile.close();

    std::ofstream pngFile(PngFileName, std::ios::binary);
    if(!pngFile.is_open()) throw std::runtime_error("Failed to create/open PNG output file.");

    ctx = spng_ctx_new(SPNG_CTX_ENCODER);
    if(ctx == nullptr) throw std::runtime_error("Failed to create spng context.");

    try
    {
        std::memset(&ihdr, 0, sizeof(ihdr));
        ihdr.width = Width;
        ihdr.height = Height;
        ihdr.bit_depth = 8;
        ihdr.color_type = SPNG_COLOR_TYPE_TRUECOLOR_ALPHA;
        ihdr.compression_method = 0;
        ihdr.filter_method = 0;
        ihdr.interlace_method = SPNG_INTERLACE_NONE;

        int resultCode = spng_set_ihdr(ctx, &ihdr);
        if(resultCode != SPNG_OK)
            throw std::runtime_error(std::string("Failed to set IHDR: ") + spng_strerror(resultCode));

        resultCode = spng_set_option(ctx, SPNG_IMG_COMPRESSION_LEVEL, 1);
        if(resultCode != SPNG_OK)
            throw std::runtime_error("Failed to set compression level to 1.");

        // Workaround: uncomment the next three lines to disable filtering and compare timings.
        //resultCode = spng_set_option(ctx, SPNG_FILTER_CHOICE, SPNG_FILTER_CHOICE_NONE);
        //if(resultCode != SPNG_OK)
        //    throw std::runtime_error("Failed to set filter choice.");

        resultCode = spng_set_png_stream(ctx, WritePNGCallback, &pngFile);
        if(resultCode != SPNG_OK)
            throw std::runtime_error("Failed to set PNG stream callback.");

        double ppm = static_cast<double>(DPI) * 39.37;
        int ippm = static_cast<int>(std::round(ppm));
        spng_phys phys;
        std::memset(&phys, 0, sizeof(phys));
        phys.ppu_x = ippm;
        phys.ppu_y = ippm;
        phys.unit_specifier = 1;
        resultCode = spng_set_phys(ctx, &phys);
        if(resultCode != SPNG_OK)
            throw std::runtime_error("Failed to set pHYs chunk.");

        size_t imageSize = static_cast<size_t>(Width) * static_cast<size_t>(Height) * 4;
        if(rawBuffer.size() < imageSize)
            throw std::runtime_error("RAW buffer is smaller than the expected image size.");

        resultCode = spng_encode_image(ctx, rawBuffer.data(), imageSize, SPNG_FMT_RAW, SPNG_ENCODE_FINALIZE);
        if(resultCode != SPNG_OK)
            throw std::runtime_error(std::string("Failed to encode image: ") + spng_strerror(resultCode));
    }
    catch(...)
    {
        spng_ctx_free(ctx);
        pngFile.close();
        throw;
    }

    spng_ctx_free(ctx);
    pngFile.close();
}

int main(int argc, char *argv[])
{
    if(argc < 3)
    {
        std::cerr << "usage: " << argv[0] << " <raw-rgba-input> <png-output>\n";
        return 1;
    }

    /* Dimensions of the attached sample: 2480x3508 pixels (A4 at 300 DPI), 8-bit RGBA. */
    const uint32_t w = 2480;
    const uint32_t h = 3508;
    const std::string raw_fname(argv[1]);
    const std::string out_fname(argv[2]);

    /* Time the whole call, including file I/O. */
    auto t0 = std::chrono::steady_clock::now();
    EncodeRawImageToPNG(raw_fname, out_fname, w, h, 300);
    auto t1 = std::chrono::steady_clock::now();
    double total = std::chrono::duration<double>(t1 - t0).count();
    printf("img encode took %f ms\n", total * 1e3);

    return 0;
}

Raw RGBA image
raw-rgba.zip
