-
Notifications
You must be signed in to change notification settings - Fork 22
fix(build): Improve build system reliability and flexibility #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
build_and_package.sh
Outdated
| mkdir -p "$OUTPUT_DIR" | ||
| PACKAGE_DIR="$OUTPUT_DIR/$BUILD_NAME" | ||
| rm -rf "$PACKAGE_DIR" | ||
| mkdir -p "$PACKAGE_DIR/include" "$PACKAGE_DIR/lib" | ||
|
|
||
| cp -r include/* "$PACKAGE_DIR/include/" | ||
| cp -r "$BUILD_DIR"/$BUILD_TYPE/*.a "$PACKAGE_DIR/lib/" | ||
| cp -r "$BUILD_DIR"/$BUILD_TYPE/*.so "$PACKAGE_DIR/lib/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can add -DCMAKE_INSTALL_PREFIX=$OUTPUT_DIR in CMAKE_OPTIONS, and run make install after make -j to collect all include files and all .a/.so to $OUTPUT_DIR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| "-DZSTD_HOME=${ORC_ZSTD_ROOT}" | ||
| "-DZLIB_HOME=${ORC_ZLIB_ROOT}" | ||
| "-DPROTOBUF_HOME=${ORC_PROTOBUF_ROOT}" | ||
| "-DProtobuf_ROOT=${ORC_PROTOBUF_ROOT}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in my development environment, the system has a higher incompatible version of protobuf installed (3.9.1). During the build process, ORC unexpectedly uses the system's protoc to generate code, but at the linking stage, it uses the project-specified version of libprotobuf, which leads to undefined reference errors.
Based on my analysis, the issue is that find_package(Protobuf CONFIG) in ORC's FindProtobuf.cmake will prioritize finding the system's protobuf and use the system's protoc. PROTOBUF_HOME is not among CMake's standard search variables, so if only PROTOBUF_HOME is set, find_package() will ignore it and search in system paths instead.
After I set Protobuf_ROOT, the problem was resolved. This issue should be reproducible if you install a higher version of protobuf on the system (e.g., 3.9.1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR improves the build system's reliability and flexibility by adding checksum verification for downloaded dependencies, fixing build configuration issues, and providing a convenient build/package script.
Changes:
- Added checksum verification with automatic skip of already-downloaded dependencies to avoid redundant downloads
- Fixed build configuration by setting Protobuf_ROOT for ORC, replacing undefined output path variables with proper CMAKE_* variables, and adding C++11 ABI control
- Added an automated build and packaging script for simplified workflows
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| third_party/versions.txt | Added package name variables for each dependency to support checksum verification and local file detection |
| third_party/download_dependencies.sh | Implemented SHA256 checksum verification with multi-platform support and automatic skip logic for existing valid files |
| cmake_modules/ThirdpartyToolchain.cmake | Added THIRDPARTY_DIR variable, local file fallback logic, Protobuf_ROOT for ORC, and C++11 ABI flag propagation to external projects |
| cmake_modules/BuildUtils.cmake | Replaced undefined OUTPUT_PATH variables with proper CMAKE_*_OUTPUT_DIRECTORY variables for consistent artifact placement |
| build_and_package.sh | New script providing automated build and packaging workflow with configurable options |
| CMakeLists.txt | Added PAIMON_USE_CXX11_ABI option with validation preventing Lance/Lumina from being enabled when ABI is off, and added CMAKE_RUNTIME_OUTPUT_DIRECTORY definition |
| .gitignore | Added output directory to ignore list |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
build_and_package.sh
Outdated
| fi | ||
| mkdir -p "$BUILD_DIR" | ||
| cd "$BUILD_DIR" | ||
| cmake -DCMAKE_BUILD_TYPE=$BUILD_TYPE -DCMAKE_INSTALL_PREFIX="$PACKAGE_DIR" $CMAKE_OPTIONS .. |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CMAKE_OPTIONS variable should be quoted when passed to cmake to properly handle options with spaces or special characters. Without quotes, word splitting could cause issues.
build_and_package.sh
Outdated
| mkdir -p "$BUILD_DIR" | ||
| cd "$BUILD_DIR" | ||
| cmake -DCMAKE_BUILD_TYPE=$BUILD_TYPE -DCMAKE_INSTALL_PREFIX="$PACKAGE_DIR" $CMAKE_OPTIONS .. | ||
| make -j"$(nproc 2>/dev/null || echo 4)" |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script uses 'nproc' without quoting in the make command. While the fallback to 4 is provided, it would be more robust to quote the entire command substitution to avoid potential issues.
| make -j"$(nproc 2>/dev/null || echo 4)" | |
| JOBS=$(nproc 2>/dev/null || echo 4) | |
| make -j"$JOBS" |
third_party/download_dependencies.sh
Outdated
|
|
||
| echo "Downloading ${url} to ${out}..." | ||
| wget --continue --output-document="${out}" "${url}" || \ | ||
| (echo "Failed to download ${url}" 1>&2; exit 1) |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message says "Failed to download" but should be "Failed downloading" to match the original error message style and maintain consistency with the rest of the codebase.
| (echo "Failed to download ${url}" 1>&2; exit 1) | |
| (echo "Failed downloading ${url}" 1>&2; exit 1) |
| fi | ||
|
|
||
| echo "Downloading ${url} to ${out}..." | ||
| wget --continue --output-document="${out}" "${url}" || \ |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The --quiet flag was removed from wget. This will make the download output more verbose. Consider whether this was intentional or if the --quiet flag should be retained for cleaner build logs.
| wget --continue --output-document="${out}" "${url}" || \ | |
| wget --quiet --continue --output-document="${out}" "${url}" || \ |
build_and_package.sh
Outdated
| BUILD_TYPE="release" | ||
| shift | ||
| ;; | ||
| -d|--debug) | ||
| BUILD_TYPE="debug" |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BUILD_TYPE variable should use proper case ('Release' or 'Debug') instead of lowercase ('release' or 'debug') to follow CMake conventions and ensure consistent behavior across all platforms and generators.
third_party/download_dependencies.sh
Outdated
| openssl dgst -sha256 "${file}" | cut -d' ' -f2 | ||
| else | ||
| # sha256sum and shasum have similar output format | ||
| ${checksum_cmd} "${file}" | cut -d' ' -f1 |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unquoted expansion of checksum_cmd relies on word splitting to work correctly. While this works in this specific context, it's fragile and could break if the command path contains spaces. Consider using an array or restructuring the code to call commands directly for better robustness.
build_and_package.sh
Outdated
| @@ -0,0 +1,129 @@ | |||
| #!/usr/bin/env bash | |||
| # | |||
| # Copyright 2024-present Alibaba Inc. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2026-present
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
third_party/versions.txt
Outdated
|
|
||
| PAIMON_ZLIB_BUILD_VERSION=1.3.1 | ||
| PAIMON_ZLIB_BUILD_SHA256_CHECKSUM=9a93b2b7dfdac77ceba5a558a580e74667dd6fede4585b91eefb60f03b72df23 | ||
| PAIMON_ZLIB_PKG_NAME=zlib-1.3.1.tar.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reduce redundancy, consider deriving *_PKG_NAME from *_BUILD_VERSION (e.g., zlib- $ {PAIMON_ZLIB_BUILD_VERSION}.tar.gz) instead of hardcoding the version twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done

Purpose
Tests
None
API and Format
No impact on API or format
Documentation
No new features introduced