x86: add SSSE3 SIMD for generate_grain_uv_{422,444}
x86: add AVX2 SIMD for generate_grain_uv_{422,444}
const correctness in thread_task
Make insert_border src pointer const
crossfiles: android: Remove misleading comment
crossfiles: android armv7: Target API 16
crossfiles: android: Remove hardcoded c_args
ci: Add android jobs with artifacts on tags
ci: Add android configs
package: Add android crossfiles
Add crosscompile config files for 32- and 64-bit Windows and 32-bit Linux
Add a cast to avoid MSVC warning
meson/android: undefine _FILE_OFFSET_BITS if fseeko is not available
Merge fix_mv{_int,}_precision() into get_gmv_2d()
Use union refmvs_pair { mv mv[2]; uint64_t n; } for MV pairs
Rewrite refmvs.c
lib: restructure the internal implementation of the decode API
android/arm: do not use fseeko in library code
headers: partially revert a recent change to Dav1dLogger doxy
CI: Only deploy documentation for master branch
Fix a typo, only need two l!
headers: add missing doxy to some Dav1dSettings fields
headers: split some public fields into separate lines and document them
CLI: Remove additional space
CLI: Remove avx512 from help text
CI: add examples job build
examples: fail when SDL is not found
CI: Add documentation CI job
CI: Deduplicate and template jobs
doc: search for dot as it's needed to build doxygen documentation
Update NEWS for 0.6.0
arm64: mc: NEON implementation of w_mask for 16 bpc
CI: run a selection of jobs on a node with avx2
x86: Fix crash in AVX2 cdef_filter with <32-byte stack alignment
arm64: mc: NEON implementation of blend for 16bpc
arm: mc: Optimize blend_v
arm64: mc: Treat the stride as a full 64 bit (potentially signed) value in blend_8bpc_neon
arm64: mc: Fix indentation
arm64: mc: Use more intuitive lane specifications for loads/stores
Update NEWS for 0.6.0
CI/armv7: use `linux32 meson ...` to allow running on aarch64
arm64: loopfilter: NEON implementation of loopfilter for 16 bpc
arm: loopfilter: Prepare for 16 bpc
arm: loopfilter: Fix a comment
fuzzing: link the fuzzing binaries as C++
fuzzing: split the fuzzing targets to their own meson.build file
x86: Add mc w_mask 4:4:4 AVX-512 (Ice Lake) asm
x86: Add mc w_mask 4:2:2 AVX-512 (Ice Lake) asm
x86: Add mc w_mask 4:2:0 AVX-512 (Ice Lake) asm
x86: Add mc avg/w_avg/mask AVX-512 (Ice Lake) asm
x86: optimize cdef_filter_{4x{4,8},8x8}_avx2
x86: add a separate fully edged case to cdef_filter_avx2
checkasm: Improve the cdef input randomization algorithm
cli: Replace malloc + memset(0) with calloc in input.c
cli: remove init_[de]muxers() functions
Replace malloc+memset(0) with calloc
CI: update aarch64 docker image to buster with meson 0.49
arm: cdef: Do an 8 bit implementation for cases with all edges present
arm32: cdef: Fix a typo for consistency
cli: Implement line buffering in print_stats()
arm: cdef: Remove leftover unused labels and macro parameters
arm64: looprestoration: NEON implementation of SGR for 10 bpc
arm64: looprestoration: Prepare for 16 bpc by splitting code to separate files
arm: looprestoration: Add 8bpc to existing function names, add HIGHBD_*_SUFFIX
looprestoration: Add a bpc parameter to the init func
arm: looprestoration: Improve scheduling in box3/5_h slightly
arm: Use int16_t for the tmp intermediate buffer
arm: looprestoration: Fix a comment
NEWS: Official naming is AVX2, not AVX-2
arm64: mc: Reduce the width of a register copy
arm64: mc: Use two regs for alternating output rows for w4/8 in avg/w_avg/mask
arm64: mc: Simplify avg/w_avg/mask by always using the w16 macro
Update NEWS for 0.6.0
arm64: mc: NEON implementation of warp for 16 bpc
arm64: cdef: Add NEON implementations of CDEF for 16 bpc
arm: cdef: Prepare for 16bpc
x86: Add cdef_filter_4x4 AVX-512 (Ice Lake) asm
Reorder the Dav1dFrameHeader struct to fix alignment issues
arm64: looprestoration: NEON implementation of wiener filter for 16 bpc
arm: looprestoration: Fix the wiener C wrapper function for high bitdepths
arm: looprestoration: Prepare for 16bpc wiener filter by adding _8bpc to function names
arm: looprestoration: Clarify a comment
arm64: mc: NEON implementation of put/prep 8tap/bilin for 16 bpc
Update README
x86/msac: add an avx2 version for msac_decode_symbol_adapt16
msac: make symbol_adapt16 a function pointer on x86_64
arm64: mc: NEON implementation of avg/mask/w_avg for 16 bpc
arm: mc: Prepare the init file for higher bitdepths
arm: Only build the existing 8bpc assembly if 8bpc is enabled
x86: Avoid cmov instructions that depend on multiple flags
x86: Add miscellaneous minor scalar optimizations
x86: Use unsigned pointer comparisons
Rework the CDEF top edge handling
checkasm: Fix missing shift in high bit-depth cdef_filter test
Avoid masking the lsb in high bit-depth stride calculations
checkasm: Increase buffer alignment to 64-byte on x86-64
arm: cdef: Add special cased versions for pri_strength/sec_strength being zero
arm: cdef: Fix some comment typos