Orange components are called widgets. They range from simple data visualization, subset selection, and preprocessing to empirical evaluation of learning algorithms and predictive modeling.
Visual programming is implemented through an interface in which workflows are created by linking predefined or user-designed widgets, while advanced users can use Orange as a Python library for data manipulation and widget alteration.[5]
Orange is an open-source software package released under the GPL and hosted on GitHub. Versions up to 3.0 include core components in C++ with wrappers in Python. From version 3.0 onwards, Orange uses common Python open-source libraries for scientific computing, such as NumPy, SciPy and scikit-learn, while its graphical user interface operates within the cross-platform Qt framework.
The default installation includes a number of machine learning, preprocessing and data visualization algorithms in 6 widget sets (data, transform, visualize, model, evaluate and unsupervised). Additional functionalities are available as add-ons (text-mining, image analytics, bioinformatics, etc.).
Orange consists of a canvas interface onto which the user places widgets and creates a data analysis workflow. Widgets offer basic functionalities such as reading the data, showing a data table, selecting features, training predictors, comparing learning algorithms, visualizing data elements, etc. The user can interactively explore visualizations or feed the selected subset into other widgets.
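The idea of linking widgets into a workflow can be illustrated with a minimal sketch in plain Python. It is a toy model of the dataflow, not Orange's actual widget API: the `Widget` class, its `link` and `receive` methods, and the sample rows are hypothetical, and the widget names are borrowed for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Widget:
    """Hypothetical workflow node; NOT Orange's real widget class."""
    name: str
    process: Callable[[Any], Any]
    outputs: List["Widget"] = field(default_factory=list)
    last_output: Any = None

    def link(self, other: "Widget") -> "Widget":
        """Connect this widget's output to another widget's input."""
        self.outputs.append(other)
        return other  # returning the target allows chained linking

    def receive(self, data: Any) -> None:
        """Process incoming data and push the result to linked widgets."""
        self.last_output = self.process(data)
        for widget in self.outputs:
            widget.receive(self.last_output)


# Made-up sample rows: (measurement, label)
rows = [(5.1, "setosa"), (7.0, "versicolor"), (6.3, "virginica"), (4.9, "setosa")]

file_widget = Widget("File", lambda _: rows)                        # reads the data
select_rows = Widget("Select Rows", lambda d: [r for r in d if r[0] > 5.0])
data_table = Widget("Data Table", lambda d: list(d))                # shows the data

# Build the workflow File -> Select Rows -> Data Table and run it.
file_widget.link(select_rows).link(data_table)
file_widget.receive(None)
print(data_table.last_output)
```

In this sketch, each widget forwards its result to every widget linked after it, which mirrors how a selected subset in one widget becomes the input of the next.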
Canvas: graphical front-end for data analysis
Widgets:
Data: widgets for data input, data filtering, sampling, imputation, feature manipulation and feature selection
Visualize: widgets for common visualization (box plot, histograms, scatter plot) and multivariate visualization (mosaic display, sieve diagram).
Regression: a set of supervised machine learning algorithms for regression
Evaluate: cross-validation, sampling-based procedures, reliability estimation and scoring of prediction methods
Unsupervised: unsupervised learning algorithms for clustering (k-means, hierarchical clustering) and data projection techniques (multidimensional scaling, principal component analysis, correspondence analysis).
Bioinformatics: components for gene expression analysis, enrichment, and access to expression databases (e.g., Gene Expression Omnibus) and pathway libraries.
Data fusion: components for fusing different data sets, collective matrix factorization, and exploration of latent factors.
Time series: widget components for time series analysis and modeling.
Single-cell: support for single-cell gene expression analysis, including components for loading single-cell data, filtering and batch effect removal, marker genes discovery, scoring of cells and genes, and cell type prediction.
Spectroscopy: components for analysis and visualization of (hyper)spectral datasets.[6]
Survival analysis: add-on for data analysis dealing with survival data. It includes widgets for standard survival analysis techniques, such as the Kaplan-Meier plot, the Cox regression model, and several derivative widgets.
World Happiness: support for downloading socioeconomic data from databases, including OECD and World Development Indicators. Provides access to thousands of country indicators from various economic databases.
Fairness: add-on for evaluation and creation of fair machine learning models without discrimination. Widgets range from computing fairness metrics, such as statistical parity, to pre-, in-, and post-processing methods for building fair models.[7]
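As an illustration of the statistical parity metric mentioned for the Fairness add-on, the following minimal sketch computes the statistical parity difference in plain Python. It is not the add-on's implementation: the function name, the group labels, and the toy predictions are assumptions made for this example. The metric is taken here as the rate of favorable predictions in the unprivileged group minus the rate in the privileged group, so a value of 0 indicates parity.

```python
from typing import Sequence


def statistical_parity_difference(
    predictions: Sequence[int],
    groups: Sequence[str],
    unprivileged: str,
    privileged: str,
) -> float:
    """Difference in favorable-prediction rates between two groups.

    `predictions` holds 1 for a favorable outcome and 0 otherwise;
    `groups` holds the protected-attribute value of each instance.
    (Hypothetical helper, not part of Orange.)
    """
    def positive_rate(group: str) -> float:
        selected = [p for p, g in zip(predictions, groups) if g == group]
        if not selected:
            raise ValueError(f"no instances in group {group!r}")
        return sum(selected) / len(selected)

    return positive_rate(unprivileged) - positive_rate(privileged)


# Made-up predictions for eight instances in two groups.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

spd = statistical_parity_difference(
    predictions, groups, unprivileged="b", privileged="a"
)
print(spd)  # group "b" rate 0.25 minus group "a" rate 0.75 gives -0.5
```

A negative value in this toy example means the unprivileged group receives favorable predictions less often than the privileged group.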
The program provides a platform for experiment selection, recommendation systems, and predictive modelling and is used in biomedicine, bioinformatics, genomic research, and teaching. In science, it is used as a platform for testing new machine learning algorithms and for implementing new techniques in genetics and bioinformatics. In education, it was used for teaching machine learning and data mining methods to students of biology, biomedicine, and informatics.
Various projects build on Orange either by extending the core components with add-ons or using only the Orange Canvas to exploit the implemented visual programming features and GUI.
In 1996, the University of Ljubljana and Jožef Stefan Institute started development of ML*, a machine learning framework in C++. Python bindings for this framework were developed in 1997 and, together with emerging Python modules, formed a joint framework called Orange. Over the following years, most contemporary major algorithms for data mining and machine learning were implemented in C++ (Orange's core) or Python modules.
In 2002, the first prototypes of a flexible graphical user interface were designed using Pmw Python megawidgets.
In 2003, the graphical user interface was redesigned and re-developed for the Qt framework using PyQt Python bindings. The visual programming framework was defined, and the development of widgets (graphical components of the data analysis pipeline) began.
In 2005, extensions for data analysis in bioinformatics were created.
In 2008, Mac OS X DMG and Fink-based installation packages were developed.
In 2009, over 100 widgets were created and maintained.
In 2009, Orange 2.0 beta was released, offering installation packages on the website based on the daily compiling cycle.
In 2012, a new object hierarchy was imposed, replacing the old module-based structure.
In 2013, a significant redesign of the graphical user interface included a new toolbox and depiction of workflows.
In 2015, Orange 3.0 was released. Orange stores the data in NumPy arrays; machine learning algorithms mostly use scikit-learn.
In 2015, a text analysis add-on for Orange3 was released.
In 2016, Orange released version 3.3. Stable releases were scheduled on a monthly cycle.
In 2016, Orange began development and release of an Image Analytics add-on, with server-side deep neural networks for image embedding.[9]
In 2017, a Spectroscopy add-on for the analysis of spectral data was introduced.[10]
In 2017, Geo, an add-on for dealing with geo-location data and visualisation of geo maps, was introduced.[11]
In 2018, Orange began development and release of an add-on for single-cell data analysis.[12]
In 2019, Orange separated its graphical interface for development as a separate project, orange-canvas-core.[13]
In 2020, Orange introduced the Explain add-on with widgets for explaining classification and regression models, highlighting how strongly specific features contribute to predicting a specific class.
In 2022, World Happiness, an add-on for the Orange3 data mining suite, was introduced, providing widgets for accessing socioeconomic data from various databases such as the World Happiness Report, World Development Indicators, and OECD.
In 2022, Orange extended the Explain add-on with an Individual Conditional Expectation plot and the Permutation Feature Importance technique.
In 2023, Orange introduced the Fairness add-on, including widgets to calculate bias metrics, as well as widgets for pre-, post-, and in-processing methods, allowing the creation of models less susceptible to systematic error due to the vagaries of the data set.
Janez Demšar; Tomaž Curk; Aleš Erjavec; Črt Gorup; Tomaž Hočevar; Mitar Milutinovič; Martin Možina; Matija Polajnar; Marko Toplak; Anže Starič; Miha Stajdohar; Lan Umek; Lan Žagar; Jure Žbontar; Marinka Žitnik; Blaž Zupan (2013). "Orange: data mining toolbox in Python" (PDF). Journal of Machine Learning Research. 14 (1): 2349–2353.
Toplak, M.; Birarda, G.; Read, S.; Sandt, C.; Rosendahl, S. M.; Vaccari, L.; Demšar, J.; Borondics, F. (2017). "Infrared Orange: Connecting Hyperspectral Data with Machine Learning". Synchrotron Radiation News. 30 (4): 40–45. Bibcode: 2017SRNew..30...40T. doi: 10.1080/08940886.2017.1338424. S2CID 125273654.
Sanchez Del Rio, Manuel; Rebuffi, Luca (2017). "OASYS (OrAnge SYnchrotron Suite): An open-source graphical environment for x-ray virtual experiments". In Chubar, Oleg; Sawhney, Kawal (eds.). Advances in Computational Methods for X-Ray Optics IV. p. 28. doi: 10.1117/12.2274263. ISBN 9781510612334. S2CID 117118973.