Akan, Taymaz; Alp, Sait; Bhuiyan, Md. Shenuarin; Helmy, Tarek; Orr, A. Wayne; Bhuiyan, Md. Mostafizur Rahman; Bhuiyan, Mohammad Alfrad Nobel
2026-03-26
2026-03-26
2025
2948-2925
2948-2933
10.1007/s10278-024-01336-y
2-s2.0-105007745569
https://doi.org/10.1007/s10278-024-01336-y
https://hdl.handle.net/20.500.14901/2467
Bhuiyan, Mostafizur Rahman/0009-0009-1916-7170; Alp, Sait/0000-0003-2462-6166; Akan, Taymaz/0000-0003-4070-1058
Heart disease is the leading cause of death worldwide, and cardiac function as measured by ejection fraction (EF) is an important determinant of outcomes, which makes its accurate measurement critical in patient evaluation. Echocardiograms are commonly used to measure EF, but human interpretation is limited by intra- and inter-observer (reader) variance. Deep learning (DL) has driven a resurgence in machine learning, leading to advancements in medical applications. We introduce ViViEchoformer, a DL approach that uses a video vision transformer to regress the left ventricular ejection fraction (LVEF) directly from echocardiogram videos. The study used a dataset of 10,030 apical-4-chamber echocardiography videos from patients at Stanford University Hospital. By extracting spatiotemporal tokens from the video input, the model captures spatial information and preserves inter-frame relationships, allowing for accurate, fully automatic EF predictions that aid human assessment and analysis. ViViEchoformer's prediction of ejection fraction has a mean absolute error of 6.14%, a root mean squared error of 8.4%, a mean squared log error of 0.04, and an $R^2$ of 0.55.
ViViEchoformer predicted heart failure with reduced ejection fraction (HFrEF) with an area under the curve of 0.83 and a classification accuracy of 87%, using a standard threshold of less than 50% ejection fraction. Our video-based method provides precise left ventricular function quantification, offering a reliable alternative to human evaluation and establishing a fundamental basis for echocardiogram interpretation.
en
info:eu-repo/semantics/openAccess
Deep Learning; Vision Transformers; Video Analysis; Echocardiography; Heart Failure; Left Ventricular Ejection Fraction; Cardiovascular Disease
Viviechoformer: Deep Video Regressor Predicting Ejection Fraction
Article
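The regression metrics quoted in the abstract (mean absolute error, root mean squared error, mean squared log error, $R^2$) and the EF < 50% rule used to label HFrEF can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names `regression_metrics`, `is_reduced_ef`, and `threshold_accuracy` are hypothetical, and the mean squared log error is assumed to use the common log(1 + x) convention, which the record does not specify.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MSLE and R^2 for paired EF values given in percent.

    Assumption: MSLE uses log(1 + x), a common convention; the source
    record does not state which definition was used.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    msle = sum(
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ) / n

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "msle": msle, "r2": r2}


def is_reduced_ef(ef_percent, threshold=50.0):
    """Label a case as reduced EF when EF is strictly below the threshold."""
    return ef_percent < threshold


def threshold_accuracy(y_true, y_pred, threshold=50.0):
    """Fraction of cases where the predicted and reference EF fall on the
    same side of the threshold."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        is_reduced_ef(t, threshold) == is_reduced_ef(p, threshold)
        for t, p in zip(y_true, y_pred)
    )
    return matches / len(y_true)
```

Calling `regression_metrics(reference_ef, predicted_ef)` on two equal-length lists of EF percentages returns the four metrics as a dictionary, and `threshold_accuracy` gives the agreement rate at the 50% cutoff. The area under the curve reported in the abstract would additionally require ranking the continuous predictions against the binary reference labels, which this sketch leaves out.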