Basak Guler

Distributed Machine Learning, Federated Learning, Machine Learning in Wireless Networks

Federated learning is a distributed learning framework that enables machine learning in mobile environments while protecting user privacy and providing robustness against user dropouts. The global model is maintained at a central server, while the training data is kept on the user devices. Rather than sending their local datasets to the server, users locally update the global model. The server aggregates these local updates through a privacy-preserving protocol called secure aggregation, and uses the aggregate to update the global model. Secure aggregation ensures that the local update of each user is kept private, both from the server and from the other users. The updated global model is then pushed back to the mobile users for inference.

A major bottleneck in scaling federated learning to a large number of users is the overhead of secure model aggregation, which grows quadratically with the number of users. We have proposed a novel secure aggregation framework, named Turbo-Aggregate, that in a network with N users guarantees a secure aggregation overhead of O(N log N), while tolerating a user dropout rate of up to 50%. Turbo-Aggregate employs a multi-group circular strategy for efficient model aggregation, and leverages additive secret sharing and novel coding techniques for injecting aggregation redundancy in order to handle user dropouts while guaranteeing user privacy. Our experiments demonstrate that Turbo-Aggregate achieves a total running time that grows almost linearly in the number of users, and provides up to 40x speedup over the state-of-the-art protocols in a network with 200 mobile users.
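As a rough illustration of the additive secret sharing building block mentioned above, the following minimal sketch lets a group of users reveal only the sum of their updates. It is not the Turbo-Aggregate protocol: the multi-group circular strategy and the coding for dropout tolerance are omitted, every user sends a share to every other user (which is exactly the quadratic pattern that motivates the work), and the field size, function names, and scalar updates are all assumptions made for the example.

```python
import secrets

PRIME = 2**31 - 1  # illustrative field size, not taken from the papers


def additive_shares(value, num_shares, prime=PRIME):
    """Split `value` into `num_shares` random shares that sum to it mod `prime`."""
    shares = [secrets.randbelow(prime) for _ in range(num_shares - 1)]
    last = (value - sum(shares)) % prime
    return shares + [last]


def secure_sum(user_updates, prime=PRIME):
    """Toy all-to-all aggregation: only per-user partial sums are revealed."""
    n = len(user_updates)
    # shares[i][j] is the share that user i sends to user j.
    shares = [additive_shares(u % prime, n, prime) for u in user_updates]
    # User j adds up the shares it received and publishes only that partial sum.
    partial_sums = [sum(shares[i][j] for i in range(n)) % prime for j in range(n)]
    # Adding the partial sums yields the sum of all updates mod `prime`.
    return sum(partial_sums) % prime


updates = [5, 17, 42, 8]  # hypothetical quantized scalar updates
total = secure_sum(updates)
```

Each individual share is uniformly random, so a single share carries no information about the update it came from; only the combination of all partial sums recovers the aggregate.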

Related publications:

Jinhyun So, Ramy E. Ali, Basak Guler, Jiantao Jiao, A. Salman Avestimehr, Securing Secure Aggregation: Mitigating Multi-Round Privacy Leakage in Federated Learning, 2021.

Jinhyun So, Basak Guler, A. Salman Avestimehr, Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning, IEEE Journal on Selected Areas in Information Theory: Privacy and Security of Information Systems, 2021.

Jinhyun So, Basak Guler, A. Salman Avestimehr, BREA: Byzantine-Resilient Secure Federated Learning, IEEE Journal on Selected Areas in Communications: Machine Learning in Communications and Networks, 2020.

Secure and Privacy-preserving Machine Learning

How can we train a machine learning model in a distributed network while keeping the data private and secure? My research builds fast and scalable frameworks to address this problem. These frameworks keep both the data and the model information-theoretically private, while allowing efficient parallelization of training across distributed workers.

Due to the typically large volume of data and complexity of models, training is a compute- and storage-intensive task. Furthermore, training is often done on sensitive data, such as healthcare records, browsing history, or financial transactions, which has important security and privacy implications. This creates a challenging dilemma. On the one hand, due to its complexity, it is often desirable to outsource the training task to more capable computing platforms, such as the cloud. On the other hand, the training dataset is often sensitive, and particular care should be taken to protect its privacy against potential breaches in such platforms. This dilemma gives rise to the following problem: how can we offload the training task to a distributed computing platform while maintaining the privacy of the dataset?

We introduce a novel framework to enable fast distributed training over a private dataset. Our framework leverages coding and information-theoretic approaches for secret sharing the dataset and model parameters, which reduces the communication overhead and significantly speeds up training. As a result, our framework can scale to a significantly larger number of workers, because the per-worker computation load decreases gradually as more workers are added to the system. Experimental evaluations for image classification on the Amazon EC2 cloud demonstrate significant speedup over the baseline protocols, while providing comparable accuracy to conventional logistic regression training.
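To give a feel for how workers can compute on secret-shared data and model parameters, here is a minimal sketch based on Shamir-style polynomial secret sharing. This is a stand-in chosen for illustration, not the encoding used in the framework described above; the field size, the threshold, the function names, and the scalar "data" and "weight" values are all assumptions.

```python
import secrets

PRIME = 2**61 - 1  # Mersenne prime used as an illustrative field size


def share(secret, threshold, num_workers, prime=PRIME):
    """Evaluate a random degree-`threshold` polynomial with constant term
    `secret` at x = 1..num_workers; any `threshold` shares reveal nothing."""
    coeffs = [secret % prime] + [secrets.randbelow(prime) for _ in range(threshold)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % prime
        return acc

    return [(x, poly(x)) for x in range(1, num_workers + 1)]


def reconstruct(points, prime=PRIME):
    """Lagrange-interpolate through `points` and return the value at x = 0."""
    result = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * (-xj)) % prime
                den = (den * (xi - xj)) % prime
        result = (result + yi * num * pow(den, -1, prime)) % prime
    return result


# Hypothetical scalar data point and model weight, each shared with threshold 1.
data_shares = share(7, threshold=1, num_workers=3)
weight_shares = share(3, threshold=1, num_workers=3)

# Each worker multiplies its two shares locally, never seeing 7 or 3.
# The products lie on a degree-2 polynomial, so 3 workers suffice to decode.
worker_results = [(x, (d * w) % PRIME)
                  for (x, d), (_, w) in zip(data_shares, weight_shares)]
product = reconstruct(worker_results)
```

The sketch also shows why the degree of the computation matters in such schemes: multiplying two degree-1 sharings doubles the polynomial degree, so more worker results are needed to decode than were needed to hide each input.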

Related publications:

Jinhyun So, Basak Guler, A. Salman Avestimehr, A Scalable Approach for Privacy-Preserving Collaborative Machine Learning, Conference on Neural Information Processing Systems, NeurIPS 2020.

Jinhyun So, Basak Guler, A. Salman Avestimehr, CodedPrivateML: A Fast and Privacy-Preserving Framework for Distributed Machine Learning, IEEE Journal on Selected Areas in Information Theory: Privacy and Security of Information Systems, 2021.

Scalable and Privacy-Aware Distributed Graph Processing

Graph topologies are important tools for modeling real-world phenomena, such as the topology of the World Wide Web, social and biological interactions, or the connectivity patterns in sensor networks. As such, modern applications often require processing large volumes of information that is represented in the form of a graph. Examples include PageRank, a widely used graph algorithm for ranking webpages in search engines, as well as semi-supervised learning and graph signal filtering, an important building block of graph neural networks. Distributed computing provides an effective means of scaling up large-scale graph processing, by distributing the graph storage and computation load across multiple processors (workers) in the cloud.
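As a concrete example of the kind of iterative graph computation described above, the following is a minimal single-machine sketch of PageRank by power iteration. It is a textbook formulation, not the coded distributed framework from the publications below; the damping factor, iteration count, and toy graph are assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power iteration on a graph given as {node: [successor, ...]}."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node starts each round with the uniform "teleport" mass.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # A node without out-links spreads its rank over all nodes.
            targets = out_links[v] or nodes
            contribution = damping * rank[v] / len(targets)
            for u in targets:
                new_rank[u] += contribution
        rank = new_rank
    return rank


# Hypothetical four-page web graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

In a distributed setting, each worker would hold a subset of the nodes, and updating a node requires the current ranks of its in-neighbors, some of which live on other workers. That exchange in every iteration is the source of the inter-processor communication load.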

However, this also requires extensive communication and coordination between the processors, which can take up to 50% of the overall execution time, making the inter-processor communication load a major bottleneck in scalability. I design scalable and privacy-aware distributed computing frameworks for large-scale graph and information processing applications, using coding and information theory principles.

Related publications:

Basak Guler, A. Salman Avestimehr, Antonio Ortega, TACC: Topology-Aware Coded Computing for Distributed Graph Processing, IEEE Transactions on Signal and Information Processing over Networks, vol. 6, pp. 508-525, May 2020.

Basak Guler, A. Salman Avestimehr, Antonio Ortega, A Topology-Aware Coding Framework for Distributed Graph Processing, International Conference on Acoustics, Speech, and Signal Processing, ICASSP'19, Brighton, UK, May 2019.

Basak Guler, Ajinkya Jayawant, A. Salman Avestimehr, Antonio Ortega, Robust Graph Signal Sampling, International Conference on Acoustics, Speech, and Signal Processing, ICASSP'19, Brighton, UK, May 2019.

Basak Guler, A. Salman Avestimehr, Antonio Ortega, Privacy-Aware Distributed Graph-Based Semi-Supervised Learning, IEEE International Workshop on Machine Learning for Signal Processing, MLSP'19, Pittsburgh, PA, Oct. 2019.

Network Information Theory

Another focus of my research is understanding the information-theoretic performance limits of multi-user context-aware communication networks. This is inspired by scenarios in which interacting parties are influenced by side information while interpreting the messages, such as external information resources or knowledge bases representing the unique backgrounds, characteristics, or biases. To do so, my research explores the fundamental limits of the amount of information that can be transferred with a fidelity criterion in a multi-user communication network when interacting parties have access to side information.

I investigate the impact of shared or differing knowledge bases on the fundamental performance limits when multiple parties transfer information through a noisy channel.

Related publications:

Basak Guler, Deniz Gündüz, and Aylin Yener, Lossy Transmission of Correlated Sources over a Multiple Access Channel: Necessary Conditions and Separation Results, IEEE Transactions on Information Theory, 64(9), pp. 6081-6097, Sep. 2018.

Basak Guler, Deniz Gündüz, Aylin Yener, On the Necessary Conditions for Transmitting Correlated Sources over a Multiple Access Channel, Proceedings of the IEEE International Symposium on Information Theory, ISIT'17, Aachen, Germany, Jun. 2017.

Basak Guler, Deniz Gündüz, Aylin Yener, On Lossy Transmission of Correlated Sources over a Multiple Access Channel, Proceedings of the IEEE International Symposium on Information Theory, ISIT'16, Barcelona, Spain, Jul. 2016.

Basak Guler, Aylin Yener, Ebrahim MolavianJazi, Prithwish Basu, Ananthram Swami, Carl Andersen, Interactive Function Compression with Asymmetric Priors, Proceedings of the IEEE Data Compression Conference, DCC'16, Snowbird, UT, Mar. 2016.

Basak Guler, Ebrahim MolavianJazi, Aylin Yener, Remote Source Coding with Two-Sided Information, Proceedings of the IEEE International Symposium on Information Theory, ISIT'15, Hong Kong, Jun. 2015.

Basak Guler, Kaya Tutuncuoglu, Aylin Yener, Maximizing Recommender's Influence in a Social Network: An Information-Theoretic Perspective, Proceedings of the IEEE Information Theory Workshop, ITW'15, Jeju Island, Korea, Oct. 2015.

Basak Guler, Aylin Yener, Compressing Semantic Information with Varying Priorities, Proceedings of the IEEE Data Compression Conference, DCC'14, Snowbird, UT, Mar. 2014.

Semantic Communication

Emerging networks such as the Internet of Things (IoT) are designed to facilitate the interaction of humans with intelligent machines. These networks consist of actors with possibly different characteristics, goals, and interests. Such differences can in turn lead to various interpretations of the received information. I design networks that operate in such ambiguous environments by leveraging the semantic and social features of information transmission. Unlike in conventional communication networks, this necessitates taking into account the personal background and characteristics of the interacting parties. For these networks, reliable communication implies that the intended meaning of messages is preserved at reception. In effect, this new generation of networks supports interaction at a level at which communicating parties can form social relationships and build trust, which may further affect how the received messages are interpreted. In contrast, communication protocols that operate in the physical layer do not take into account the difference between the meanings of transmitted and recovered messages, but rather are concerned with the engineering problem of reliably communicating sequences of bits to the receiver. These factors together motivate a new approach that merges physical and application layer metrics into one, i.e., a novel performance criterion that takes into account the meanings of the communicated messages. I design mechanisms to achieve this, i.e., to reliably communicate the meanings of messages through a noisy channel.

To model the impact of social influence on how messages are interpreted, I consider an external influential entity who can affect how the destination perceives the received information. The exact nature of this entity, whether adversarial or helpful, is unknown to the communicating parties. An entity with such influence can have a significant impact on information recovery, hence transmission policies should be tailored to account for the uncertainty in its intentions.

Related publications:

Basak Guler, Aylin Yener, and Ananthram Swami, The Semantic Communication Game, IEEE Transactions on Cognitive Communications and Networking, vol. 4, no. 4, pp. 787-802, Dec. 2018.

Basak Guler, Aylin Yener, Ananthram Swami, The Semantic Communication Game, Proceedings of the IEEE International Conference on Communications, ICC'16, Kuala Lumpur, Malaysia, May 2016.

Basak Guler, Aylin Yener, Semantic Index Assignment, Proceedings of the Sixth International Workshop on Information Quality and Quality of Service for Pervasive Computing (IQ2S'14) in Conjunction with IEEE PERCOM 2014, Budapest, Hungary, Mar. 2014.

Heterogeneous Wireless Networks: Interference Management

A heterogeneous wireless network is an environment that consists of cellular base stations of various sizes, coverage areas, and operating protocols. Some of these are installed and maintained by the mobile operator, such as macrocells (cellular towers), while the others are plug-and-play devices installed by the users, such as femtocells.

This ad hoc nature of deployment makes interference management across different tiers a challenge for mobile operators. I addressed this problem in a two-tier network that involves femtocells in addition to macrocells. Mobile user devices as well as base stations are equipped with multiple antennas, which creates spatial dimensions that can be used for interference cancellation or for increasing the data rate. To alleviate cross-tier interference, the interference received from the macrocell users can be aligned in a low-dimensional subspace at multiple femtocells, while simultaneously ensuring that the performance requirements of the macrocell users are satisfied. This can enable coexistence with high data rates even at very high interference levels, when communication would otherwise be impossible.

Related publications:

Basak Guler, Aylin Yener, Uplink Interference Management in Coexisting MIMO Femtocell and Macrocell Networks: An Interference Alignment Approach, IEEE Transactions on Wireless Communications, vol. 13, no. 4, pp. 2246-2257, Apr. 2014.

Basak Guler and Aylin Yener, Selective Interference Alignment for MIMO Cognitive Femtocell Networks, IEEE Journal on Selected Areas in Communications: Cognitive Radio Series, vol. 32, no. 3, pp. 439-450, Mar. 2014.

Basak Guler and Aylin Yener, Selective Interference Alignment for MIMO Femtocell Networks, Proceedings of the IEEE International Conference on Communications, ICC'13, Budapest, Hungary, Jun. 2013.

Basak Guler and Aylin Yener, Interference Alignment for Cooperative MIMO Femtocell Networks, Proceedings of the IEEE Global Communications Conference, Globecom'11, Houston, TX, Dec. 2011.