An Efficient and Anonymous Multidimensional Data Aggregation Scheme Based on Fog Computing for Smart Grids

Abstract

With the development of smart grid and the diversification of their applications, privacy-preserving multi-dimensional data aggregation has been widely studied because it can analyze users’ electricity consumption more deeply. However, previous multidimensional data aggregation schemes suffer from heavy computational operations either to encrypt or decrypt the data. Additionally, existing fault-tolerant mechanisms require aggregation nodes to identify the IDs of smart meters, which leads to the risk of user privacy leakage. In this paper, we propose an efficient and anonymous multidimensional data aggregation scheme, called EAMA. In the proposed scheme, multidimensional data are encrypted by the Paillier encryption scheme of lower modular exponentiation, which reduces the computational cost. Moreover, to protect users’ privacy, smart meters upload data using pseudonyms in a fog computing-based architecture, which prevents aggregation nodes from accessing smart meters’ real IDs. Furthermore, a key challenge in the fault-tolerant phase is to eliminate blind factors, which EAMA effectively addresses without needing the real IDs of smart meters. Performance analysis demonstrates that EAMA achieves lower computational costs and satisfies security requirements.

Share and Cite:

Chen, Y. , Fan, J. and Zhang, Y. (2025) An Efficient and Anonymous Multidimensional Data Aggregation Scheme Based on Fog Computing for Smart Grids. Journal of Computer and Communications, 13, 156-175. doi: 10.4236/jcc.2025.133011.

1. Introduction

As an advanced power system, the smart grid (SG) makes use of contemporary information and communication technology [1] [2] to facilitate real-time data exchange and communication, increasing efficiency for better energy management. By integrating traditional power grid infrastructures with advanced technologies such as cloud computing [3], fog computing [4] and mobile edge computing [5], the smart grid enables real-time data collection, transmission, and processing. This integration facilitates communication in both directions between companies and consumers, improving the efficiency, affordability, and stability of power management. A fundamental element of the SG is the smart meters (SMs) [6], which are positioned at the consumer’s location to gather real-time electricity usage data and periodically report the data to the control center (CC). SMs are considered the most critical element on the consumer side, since they allow users to report their real-time energy usage actively. Based on the data, various applications are designed to enable the CC to predict power demand, adjust power generation, implement dynamic pricing strategies, and optimize overall grid management, etc. In order to give other organizations relevant data for commercial advertising or decision-making in the energy sector, CC could potentially carry out more statistical research [7]. Despite these advantages, the smart grid faces various security and privacy challenges. For example, by analyzing real-time power consumption data, malicious actors could potentially uncover users’ habits and activities, leading to leak privacy. Besides, the aggregated data must be correct, even if certain SMs do not submit data of their own. Consequently, it is essential to adopt robust encryption technologies to safeguard user privacy.

Privacy-preserving data aggregation [8] has been extensively recognized as an effective approach for protecting data security in smart grids. To do this, SMs can employ homomorphic encryption (HE), enabling them to encrypt electrical usage data through Paillier Homomorphic Encryption [9]. This allows the aggregator gateway (AG) to safely aggregate the data that is encrypted while ensuring that the CC receives the total from SMs in a manner that preserves privacy [10]-[12]. Earlier aggregation methods primarily employed one-dimensional data aggregation, which was effective but lacked the fine-grained data required for a more detailed energy consumption analysis. Recent schemes have introduced multidimensional data aggregation techniques in smart grids, offering significant improvements over one-dimensional methods. Multidimensional aggregation enables the collection of electricity consumption data across various appliance categories, such as wall lamp, refrigerators, air conditioners, and dryers. The more fine-grained the data, the more effective the subsequent analysis becomes. However, multidimensional aggregation often involves complex cryptographic operations to ensure data encryption and decryption, adding to the computational complexity of the process. Traditional aggregation schemes rely on low-performance intermediate nodes that are often connected to multiple end nodes, limiting their ability to efficiently handle such cryptographic operations. To address this challenge, fog node (FN) [13] has been introduced in smart grids to enhance communication, computation, and storage capabilities, optimizing data aggregation. This transition to fog computing enhances the efficiency and scalability of data aggregation procedures.

As the data dimensions increase, the communication costs grow. A critical challenge is to minimize communication costs while maintaining user privacy. Previous aggregation schemes have employed masking technology [14] [15] to mask individual users’ electricity consumption data, which reduces computational and communication overhead. However, these schemes fall short in terms of fault tolerance, as they require retransmission of billing reports in the event of node failures. Fault tolerance is essential for ensuring the robustness of data aggregation in smart grids. When SMs temporarily fail, preventing them from reporting data to FNs, fault tolerance mechanisms can help maintain data integrity. Some mechanisms have been suggested to achieve fault tolerance in response to the challenges presented by faulty SMs. User data is encrypted via a modified variant of the Paillier cryptosystem [16]. However, as the quantity of faulty SMs increases, the decryption expense raises substantially. Similarly, the framework suggested by Boudia et al. [17] incorporates a fault-tolerant method to guarantee the accuracy of the final aggregated data retrieved at the CC, even though the failure of certain nodes. This scheme employs a masking technique to protect individual user data, where the masks are designed to cancel each other out during aggregation. However, this approach requires the FN to know the real identities of the SMs, which introduces privacy concerns.

To solve the above problems, we propose EAMA, an efficiency and anonymous multidimensional data aggregation scheme based on fog computing for smart grids. This paper’s primary achievements can be succinctly stated as below.

1) We have improved the Paillier cryptosystem by designing an encoding function that encodes multiple data items into a single ciphertext. Using this encoding function, the CC successfully obtains the aggregated data in the ciphertext. Additionally, the most computationally resource-intensive operation in Paillier encryption is the higher-order exponentiation of r n , which we optimize to effectively reduce the computational overhead.

2) Since the data aggregator, the FN, is honest-and-curious, so we propose a bilinear pair-based batch anonymization authentication algorithm to efficiently verify users’ anonymity and data integrity.

3) We offer a fault-tolerant approach for ensuring the accurate recovery of the final aggregated data at the CC. When an SM fails, our mechanism solves the problem of blind factor elimination without requiring the real ID of the faulty SM.

The subsequent sections of this article are structured as follows. Section II introduces related work. Section III introduces preliminary knowledge. Section IV introduces the system design of the EAMA, followed by Section V, which introduces the proposed scheme. Section VI introduces the system analysis of the EAMA, and Section VII discusses performance evaluation. Finally, Section VIII concludes the paper.

2. Related Work

In recent years, various data aggregation methods have been introduced for smart grids. Among these, the work presented in [18] introduced a secure in-network aggregation approach specifically designed for smart grid environments. This approach leverages HE to ensure that users’ private data remain undisclosed to intermediate aggregator nodes. As smart grids evolve, there is an increasing necessity for CC to obtain detailed data for enhanced services and to optimize demand response strategies. At the same time, it is crucial to preserve the privacy of individual users. HE emerged as a particularly advantageous approach for encrypted data aggregation owing to its homomorphic characteristics, which allow CC to execute statistical analyses effectively on encrypted data, thereby preserving the secrecy of user information.

In 2012, Lu et al. [19] introduced a multidimensional data aggregation framework utilizing the Paillier HE encryption method. Their technique employed a super-increasing sequence to facilitate CC’s computation of the sum of diverse power usage data kinds. However, super-increasing sequences become less efficient for data packing when the dimensionality is high. To tackle this issue, Li et al. [20] introduced a multi-subset data aggregation strategy utilizing the Paillier homomorphic encryption algorithm, incorporating two super-increasing sequences. This design allowed the aggregation of data from different ranges, enabling CC to obtain more granular insights, such as the sum of consumption within specific regions and the consumer’s count in each range. Despite these advancements, as noted in [21], the scheme restricts each subset of SMs to predefined data ranges, limiting the flexibility and utility of the aggregated data. Boudia et al. [22] proposed a secure multidimensional data aggregation scheme based on elliptic curves and utilizing ElGamal homomorphic encryption (HE) with multiple public keys. Their approach avoided complex encryption operations during data transmission, reducing both computational and communication overhead. However, the scheme required additional elliptic curve scalar multiplication operations for each dimension via the security module, which imposed a significant computational burden. In contrast, Zuo et al. [23] employed a super-incremental sequence combined with the ElGamal HE technique for multidimensional data aggregation. Their technique devised two categories of super-increasing sequences: one for computing the aggregate electrical usage of identical types among all users, and the other for verifying the total amount of individuals whose electrical usage resides inside specified intervals.

The data aggregation schemes proposed in [14] [15], and [24] employ masking techniques to conceal individual users’ electricity usage data. By aggregating all masks, the values cancel each other out, effectively preserving user privacy while providing the overall power usage data. These schemes are notable for their low computation and communication overhead. However, they lack fault tolerance, necessitating additional billing reports to be transmitted in the event of system failures. Zhang et al. [25] established a privacy-preserving billing mechanism that integrates ElGamal’s multidimensional data aggregation algorithm with an anonymous user identity design. This method efficiently safeguards against collusion attacks perpetrated by any two individuals within the system. Additionally, the ciphertexts generated by ElGamal encryption are typically smaller than those produced by Paillier encryption, making ElGamal more efficient in terms of ciphertext storage for multidimensional data. Boudia et al. [17] employed the HE approach into a fog computing architecture to encode multidimensional data. For efficient authentication, their scheme employed a batch authentication technique. Although their method, known as ESMA, is fault tolerant, it requires blind factorial updates for all functioning SMs.

To address these limitations, we introduced encoding functionality into EAMA, utilizing the Paillier homomorphic encryption (HE) technique to efficiently construct and secure multidimensional data. EAMA employs a pseudonymization technique, enabling the CC to read and process aggregated data without requiring blind factor updates for all functioning SMs, even in cases where some SMs are non-functional. This ensures that the CC can continue processing aggregated reports, maintaining fault tolerance and operational efficiency.

3. Preliminaries

3.1. Bilinear Pairing Maps

A bilinear pairing map e ˜ : G 1 × G 1 G 2 is defined on the elliptic curve E on the finite field F p , where p is a large prime. In this context, G 1 denotes an additive cyclic group of order q derived from the elliptic curve E , while G 2 represents a multiplicative cyclic group of identical order q . The bilinear pairing map exhibits the following properties:

1) Bilinearity: For every P,V G 1 and a,b q * , it holds that e( aP,bV )=e ( P,V ) ab .

2) Non-degeneracy: There are two elements P,V G 1 such that e( P,V )1 , where 1 denotes the identity element of G 2 .

3) Computability: For any P,VG , there exists an efficient algorithm to compute e( P,V ) .

3.2. Optimization of the Paillier Cryptosystem for Higher Order Polynomial Operations

The Paillier cryptosystem [9], after optimizing the power operation g m in its original encryption process, encounters its most resource-intensive computation in the evaluation of the higher-order power function r n . In 2010, Ivan Damgård [26] proposed an optimization strategy to streamline the calculation of r n , demonstrating that this enhancement preserves the security level of the original Paillier algorithm. Therefore, the security of the modified Paillier encryption algorithm will be analyzed by imitating the original Paillier encryption algorithm.

1) Key Generation: To ensure security, it is required that pq3mod4 , and gcd( p1,q1 )=2 .

Compute λ= ( p1 )( q1 ) 2 . Select a random number x such that x n * , and calculate h= x 2 modn . Choose a natural number s . In the original Paillier scheme, this corresponds to setting s=1 . Compute h s = h n s mod n s+1 . The optimized public key is defined as: PublicKey=( n, h s ) and the private key is: PrivateKey=( λ,μ ) .

2) Encryption: Generate a random number α , where α 2 [ k/2 ] , and k is the key length. The optimized encryption formula is: c=( n×m+1 )× h s α mod n s+1 . By choosing αn , the computational efficiency of h s α is significantly improved compared to r n in the original encryption process.

3) Decryption: To decrypt the ciphertext c , the plaintext m is recovered using the equation: m=D( c )=L( c λ mod n 2 )μmodn , where L( u )= u1 n is the decryption function used in the Paillier cryptosystem.

4. System Design

4.1. System Model

Our proposed method emphasizes the secure aggregation of sensor data within a Fog computing-based Smart Grid system while ensuring the privacy of sensitive information. Our system model involves four participating entities: a group of SMs, S M j ={ S M 1j ,S M 2j ,,S M nj } located at the network edge; FNs situated near smart devices; a Remote CC; and a Key Generation Center (KGC), as depicted in Figure 1.

Figure 1. System model.

1) SMs: SMs, denoted as S M ij , are terminal devices within the Internet of Things (IoT) network. The meters in question possess sensors, computational parts, to gather real-time data from their environment. The collected data is periodically reported to the nearest FN for further processing.

2) FNs: FNs are positioned nearby smart devices to enable communication between S M ij and the CC. FN offers critical services, including data validation, the process, and storage for the CC. In our suggested method, the FN acquires data from S M ij within its jurisdiction, authenticates and consolidates the data, and subsequently transmits the aggregated ciphertext to the CC for additional examination.

3) CC: The CC is tasked with producing system parameters during initialization and analyzing the aggregated data uploaded by the FN. The CC is crucial in overseeing and administering the system’s overall functionality, utilizing the consolidated data obtained from the FNs.

4) KGC: The Key Generation Center (KGC) functions as a reliable intermediary in the cryptosystem, tasked with distributing public system parameters and private keys for SMs and CC, while also producing pseudonyms for each SM to safeguard user identity privacy.

4.2. Design Goals

Our objective is to introduce an effective and anonymous privacy-preserving multidimensional data aggregation scheme for SG based on Fog Computing. The objective is to safeguard critical information transmitted by smart terminal devices while maintaining user privacy. The subsequent objectives must be accomplished:

1) Security: The proposal must ensure the secrecy, integrity, and authenticity of data transition among system entities. Outside adversaries ( A ) must be denied access to decrypted data. Any modifications to messages should be detectable, and the authentication of transmitting data, along with the verification of legal entity identities, should be performed by FNs and the CC.

2) Privacy Protection: User privacy is paramount in SG. Decryption keys are known only to the CC. Ciphertext computations are performed exclusively by the FN. KGC is aware of users’ real identities, ensuring isolation between keys, ciphertext, and identity. This isolation guarantees that total power consumption can be obtained without revealing user privacy. Individual privacy in IoT applications is crucial. In this proposal, individual privacy is an unrevealed secret to A . Even if there is collusion between the FN and CC, individual privacy is protected when user S M ij employs anonymous privacy encryption.

3) Fault Tolerance: In Smart Grids, S M ij may experience communication failures. The proposal must include fault tolerance mechanisms to ensure correct aggregation and decryption in the event of certain S M ij failures.

4) Performance: Performance is crucial for both SMs and FNs to meet the practical demands of data aggregation in Smart Grids. This requires keeping computation and communication costs as low as possible to ensure system efficiency. Performance is essential for data aggregation in extensive SG, facilitating the prompt and efficient processing of significant data volumes.

5. Proposed Scheme

In this section, we present EAMA for Multidimensional Data Aggregation. Table 1 presents a compilation of acronyms and symbols utilized in this article, accompanied by their definitions. The scheme consists of five main components:

Table 1. Notations.

Notation

Definition

CC

The CC

FN

The FN

SM

The SM

n

The modulus defined as n= p 0 q 0

g

The generator of the group n 2

G 1

An additive group

G 2

A multiplicative group

( n, h 1 )

The public key pair

( λ,μ )

The private key pair

e

A bilinear pairing

H

A secure hash function, H: { 0,1 } * G 1

H 1

A secure hash function H 1 : { 0,1 } * × G 1 G 1

w j

The number of SM covered by F N j

a ij

The secret share of S M ij

s k ij

The secret key of S M ij

p k ij

The public key of S M ij

m ij,l

The data type l of S M ij

b ij,l

The encoded representation of m ij,l

l

The total count of data categories

2 z

The upper limit for a data category’s value

5.1. System Initialization

Based on the security parameter κ , KGC is required to generate two κ -bit prime numbers, p 0 and q 0 , which satisfy the conditions: p 0 q 0 3( mod4 ) , gcd( p 0 1, q 0 1 )=2 . The KGC subsequently chooses two large κ -bit primes, p 0 and q 0 , and calculates the public key of the Paillier encryption system as n= p 0 q 0 , g=1+n . The associated private key is expressed as. Subsequently, the λ= ( p 0 1 )( q 0 1 ) 2 KGC constructs a function L( u )= u1 n and computes μ as μ= ( L( g λ mod n 2 ) ) 1 modn A random number x is then chosen, where x n * , and h= x 2 modn . A natural number s is chosen, and for the original Paillier system, s=1 . For h s = h n s mod n 2 , we have h 1 = h n mod n 2 . The public key is denoted as the tuple ( n, h 1 ) , whereas the associated private key is represented by the tuple ( λ,μ ) . The KGC defines a bilinear pairing map e: G 1 × G 1 G 2 , where G 1 and G 2 are two multiplicative cyclic groups of identical order p , and P serves as the generator of G 1 . The KGC further establishes four collision-resistant hash functions: H: { 0,1 } * G 1 and H 1 : { 0,1 } * × G 1 G 1 .

The KGC establishes the upper limit of FNs at J , and defines S M ij { 1j,2j,,Nj } as the indexed collection of operational SMs. A pseudo-random number producer generates N×J secret shares a ij n * , where i=1,,N , and j=1,,J . The KGC then computes a 0j :

a 0j = ijS M ij a ij modn,wherej=1,,J (1)

5.2. Registration

1) S M ij Registration and Pseudonym Selection: Each SM S M ij produces its unique identity I D ij , randomly chooses a private key s k ij q * , and derives its public key as p k ij =s k ij P , with P representing the group’s generator. The pseudonym PI D ij is computed as: PI D ij = H 1 ( RI D ij ||s k ij ) The signature σ ¯ ij is then calculated as: σ ¯ ij = H 1 ( PI D ij p k ij TS )s k ij The SM S M ij sends { RI D ij ,PI D ij ,p k ij } to KGC along with the timestamp TS .

Upon receiving the registration request, the KGC first verifies the signature:

σ ¯ ij P= H 1 ( PI D ij p k ij TS )p k ij

If the signature is valid, the KGC generates a list of pseudonyms L ij based on the real identity RI D ij . The pseudonym list corresponding to PI D ij of S M ij is as follows: L ij ={ PI D ij1 ,PI D ij2 ,,PI D ijl , } .

During the l -th anonymous update cycle, the KGC selects a random number z l q * (where l=1,2,,N ) to modify the pseudo-identity of S M ij . The updated pseudonym is computed as: PI D ijl =PI D ij H 1 ( PI D ij z l P ) The new signature σ ijl is calculated as:

σ ijl =H1( PI D ijl p k ij   TS )s k ij (2)

The KGC verifies the following equation to ensure consistency:

H1( PI D ij p k ij TS )p k ij = σ ijl P (3)

If the formula is satisfied, the tuple { PI D ijl ,p k ij , σ ijl } is recorded. After successful verification and registration, the KGC sends { PI D ijl ,p k ij , σ ijl } to the FN and CC for further processing.

2) FN Registration: FN is registered, randomly selected s k j Z q * as the private key, and computed the public key p k j =s k j P , where { I D j ,p k j } sent to KGC.

3) CC Registration: CC randomly selects s k CC Z q * as the private key and computes the public key p k CC =s k CC P . Then, CC sends { I D CC ,p k CC } to KGC.

At the end of this phase, KGC publishes the system parameters as

Γ=( e, G 1 , G 2 ,p,P,n,g, h 1 ,z,H, H 1 ,p k CC , { p k j } 1jJ , { p k ij } 1iN,1jJ )

5.3. Report Generation

During the time period T , each S M ij encrypts the l -dimensional data ( m ij,1 , m ij,2 ,, m ij,l ) and simultaneously creates the corresponding signature in Figure 2.

Figure 2. Data operation flowchart.

Step 1: S M ij encodes ( m ij,1 , m ij,2 ,, m ij,l ) into the binary string ( b ij,1 , b ij,2 ,, b ij,l ) where each of the b ij,d = ( m ij,d ) 2 0 z( d1 ) ,d=1,2,,l and assigns its private information as m ij = d=1 l b ij,d . B ij = b ij,1 + b ij,2 ++ b ij,l . Then encode, S M ij to compute B ij ¯ .

B ij ¯ = B ij + a ij modn (4)

Step 2: In the preceding Paillier encryption method, S M ij selects a random number r ij n * and calculates the encrypted content as

C ij = g B ij ¯ r ij n mod n 2 (5)

In our Paillier cryptosystem, randomly generate α ij , α ij Z 2 [ k/2 ] , where k is the key length

C ij = g B ij ¯ g α ij mod n 2 = (n+1) B ij ¯ h 1 α ij mod n 2 =( n B ij ¯ +1 ) h 1 α ij mod n 2 =( n B ij ¯ +1 ) h 1 α ij mod n 2 =( n( B ij + a ij modn )+1 ) h 1 α ij mod n 2 (6)

Step 3: S M ij Utilize the private key s k ij to generate the signature as described below. σ ij =s k ij H( C ij ||TS||I D j ||PI D ijl ) where TS means the present timestamp.

Step 4: S M ij transmits the verified data report information ( C ij ,TS, σ ij ,PI D ijl ) to the corresponding F j .

5.4. Data Aggregation

1) All-Inclusive Data Aggregation: During the period T , we posit the occurrence of certain SM failures within the SG, and use S M j { PI D 1jl ,PI D 2jl ,,PI D Njl } to denote the index set of working SMs. Let w j =| S M j | describe the quantity of operational SMs under F j , and let t signify the minimum threshold of the effective sample size. When w j t , verified data report messages (e.g., ( C ij ,TS, σ ij ,PI D ijl ) ) are received from legitimate SMs, F j conducts a batch validation by verifying the subsequent calculation:

( e( ijS M j σ ij ,P )= ijS M j e ( H( C ij TSI D j PI D ijl ),p k ij )

The FN j will initially conduct bulk verification to authenticate the received signature by determining if the subsequent equation is satisfied:

e( ijS M j σ ij ,P )= ijS M j e( H( C ij TSI D j PI D ijl ),p k ij ) = ijS M j e( H( C ij TSI D j PI D ijl ),s k ij P ) = ijS M j e( s k ij H( C ij TSI D j PI D ijl ),P ) = ijS M j e( σ ij ,P ) (7)

e( P, i=1 w j σ ij )= i=1 w j e( p k ij ,H( C ij I D ij TS ) ). (8)

Batch verification decreases the quantity of pairing processes from w j to w j +1 . Upon validation, FN j aggregates all encrypted data and transmits the aggregated data to CC. Execute the subsequent steps.

Step 1: FN j aggregates w j encrypted ciphertexts into

C j = i=1 w j C ij mod n 2 = i=1 w j g B ij ¯ h 1 α ij mod n 2 = g i=1 w j B i ¯ ( i=1 w j h 1 α ij )mod n 2 = g i=1 w j ( B ij + a ij )modn ( h 1 i=1 w j α ij )mod n 2 =( 1+n( i=1 w j ( B ij + a ij )modn ) )( h 1 i=1 w j α ij )mod n 2 (9)

Step 2: The function FN j generates the signature utilizing its private key s k j in the following manner.

σ j =s k j H( C j TSI D j ) (10)

where TS denotes the recent timestamp.

Step 3: FN j sends a full report to CC containing

{ C j ,TS, σ j ,I D j } (11)

If some SMs break down, FN j will not receive the corresponding packets.

2) Fault-Tolerant Data Aggregation: When certain SMs are unable to transmitting data, FN j will not obtain the relevant packets. Let S M j { PI D 1jl ,PI D 2jl ,,PI D Njl } denote the collection of all operational SM units in FN j , and let S M j represent the subset of defective SM devices ( S M j S M j ). Consequently, we get

a 0j ijS M j /S M j a ij modn. (12)

Consequently, this phenomenon will directly influence the accuracy of the ultimate decryption result. The entity FN j must transmit the set S M j to the KGC. Subsequently, the KGC generates a new a 0j based on S M j and transmits it to the CC.

5.5. Data Reading

1) All-Inclusive Data reading: Upon obtaining a whole report from FN j , CC initially authenticates the signature in accordance with the subsequent equation:

e( P, σ j )=e( p k j ,H( C j TSI D j ) ) (13)

When the equation is satisfied, it indicates that the signature is legitimate. Upon verifying the authenticity, CC decrypts the aggregated ciphertext C j and extracts the aggregated data by executing the subsequent phases:

Step 1: CC decrypts the aggregated ciphertext by first retrieving the encrypted data C j . From Equation (8), take:

M= ijS M j ( B ij + a ij )modn (14)

The report ( 1+Mn ) ( h 1 ) i=1 w j α ij mod n 2 is still the ciphertext of the Paillier encryption system. It can be equated to g M R n mod n 2 with R= i=1 w j r ij and CC uses the tuple ( λ,μ ) to recover M : as

M=L( C j λ mod n 2 )μmodn (15)

Step 2: After decryption, CC uses a 0j to obtain ijS M j B ij as follows:

ijS M j B ij =M a 0j modn (16)

Step 3: CC Retrieve each aggregated data ijS M j b ij,d , d=1,2,,l using the decoding function.

The CC divides the binary representation of ijS M j B ij into l bit chunks of maximal length less than z , so that the aggregated data M ij = ijS M j ,d=1 l m ij,d can be written as

( ijS M j B ij ) 2 = ijS M j m ij,l ijS M j m ij,2 ijS M j m ij,1 (17)

2) Fault-Tolerant Data Reading: Upon receipt of the fault-tolerant report from FN j , CC initially authenticates the signature in accordance with the subsequent equation:

e( P, σ j )=e( p k j ,H( C j TSI D j ) ) (13)

However, the released M ( M= ijS M j ( B ij + a ij )modn ) contains only a subset of the a ij values, and thus we cannot use a 0j to eliminate the blind factor. Consequently, we utilize the a 0j generated by the KGC to obtain ijS M j B ij .

ijS M j B ij =M a 0j modn (18)

CC Retrieve each aggregated data ijS M j b ij,d , d=1,2,,l using the decoding function.

The CC divides the binary representation of ijS M j B ij into l bit chunks of maximal length less than z , so that the aggregated data M ij = ijS M j ,d=1 l m ij,d can be written as

( ijS M j B ij ) 2 = ijS M j m ij,l ijS M j m ij,2 ijS M j m ij,1 (19)

Thus, we can convert the aggregated data into a binary bit string, and then separate out the substrings of length z bits starting from the lower bit to retrieve the aggregated data for each dimension.

6. System Analysis

In EAMA, we presume that FNs and CC are honest and curious. Each FN precisely compiles encrypted multidimensional electrical usage data from the identical grid zone, while the CC effectively checks and decrypts the data that is provided. Both entities are interested in users’ power consumption trends and may seek to extract useful information. Furthermore, an internal attacker (e.g., a curious user) may attempt to exploit the secret key of the SM and other essential factors to get the original data from SMs located in other users’ residences.

1) Anonymity: During data transmission, the SM communicates with the FN via a pseudonym. During the information transmission session, SMs establish communication links with the FN utilizing temporary identity. As the parameter z l employs a randomization technique in the identity generation process, there exists no discernible association between several temporary identifications produced by the same device. Only KGC has the capability to restore the user’s true identity by analyzing the temporary IDs. To augment security and resist attackers from correlating high-precision electricity consumption data with user identities, the system periodically updates and manages the temporary identifiers of all SMs. This technique efficiently safeguards user identification information, hence achieving the anonymity objective of the scheme design.

2) Confidentiality: An outside attacker is presumed to possess the capability to eavesdrop on the public communication channel between the SM and the FN, thereby intercepting the encrypted message C ij . Nonetheless, due to the semantic security afforded by the Paillier encryption method, an adversary is unable to extract any meaningful information from the ciphertext, even if it is acquired. During the encrypted data reporting phase, each S M ij first encodes the l -dimensional data ( m ij,1 , m ij,2 ,, m ij,l ) into a binary string ( b ij,1 , b ij,2 ,, b ij,l ) , where each b ij,d = ( m ij,d ) 2 0 z( d1 ) , for d=1,2,,l . The private data is then set as m ij = d=1 l b ij,d . Under the modified Paillier encryption scheme, the corresponding ciphertext is generated as: C ij =( n B ij ¯ +1 ) h 1 α ij mod n 2 where B ij ¯ = B ij + a ij modn . Each SM ij holds a unique secret parameter a ij , which varies for each instance. Consequently, EAMA is resistant to selective plaintext attacks.

3) Privacy: In this framework, FN retains only the encrypted data of anonymous users without possessing the decryption key. CC possesses the decryption key but cannot access the fine encrypted information or the users’ true identities; and KGC, while aware of the users’ true identities, cannot acquire their detailed data. Even if FN and CC collude, CC can decrypt the encrypted data of individual users acquired from FN. Nonetheless, it is challenging to erase the secret parameter a ij , preventing the identification of the actual users behind the electricity consumption data. Consequently, despite the potential for cooperation between two parties, the method can nevertheless proficiently obscure real-time electricity usage data from being linked to the user’s true identity, so safeguarding personal privacy.

4) Leak Toughness: EAMA ensures that a group of collaborating users does not threaten the privacy of other users. When an attacker A seeks to compromise a user’s privacy, they must have access to personal data and the associated secret share x ij . In EAMA, the secret shares a ij n * produced by the KGC are allocated independently, indicating that the compromise of secret shares from a subset of users does not disclose those of other users. Assume that A successfully compromises w j 1 users associated with a certain FN j and acquires their secret shares a 1j , a 2j ,, a ( w j 1 )j . For these w j 1 users, Equation (1) can be reformulated as:

a 0j = i=1 w j 1 a ij + a ( w j 1 )j modn (20)

Despite this, the data privacy of the remaining users remains intact, as A does not possess the CC’s secret share of the sum a 0j nor the Paillier private key. Therefore, A can’t compromise a w j j . Consequently, with EAMA, A cannot access the private data of other users, irrespective of the number of hacked users. Even if the Paillier private key were to be obtained through logarithmic computation, A would still be unable to retrieve individual data contents due to the secure embedding of the secret shared value a ij within each ciphertext. Thus, in EAMA, encrypting power consumption data not only ensures leakage resilience but also preserves data confidentiality and the privacy of all SMs.

7. Performance Evaluation

Utilizing established cryptographic libraries, MIRACL [27] and PBC [28], we evaluate the expense of the cryptographic operations employed in EAMA and prior technologies. Our technique is predicated on the modified Paillier cryptosystem utilizing a 1024-bit key and a pairing operation within a base field of 160 bits. We utilize the standard curve secp160r1 for ECC.

Table 2 presents the findings acquired on a computer equipped with an Intel i7-8750H CPU operating at 2.2 GHz and 8 GB of RAM. It is important to recognize that multiplication in Z n 2 , modular addition, and hash operations are insignificant in comparison to exponentiation in Z n 2 and pairing operations.

Table 2. Time consumption for related operations.

Symbol

Definition

Time (ms)

T h

Time for general hash function

0.01

T H

Time for hashing to ECC point multiplication

0.02

T ex

Time for modular exponentiation

7.13

T mun

Time for regular multiplication

1.58

T pmn

Time for ECC point multiplication

0.32

T pa

Time for ECC point addition

0.34

T inv

Time for modular inversion

8.72

T bp

Time for bilinear pairing operation

16.81

7.1. Computation Cost

In EAMA, when S M ij produces its output, it necessitates T mu + T ex to generate C ij and T H + T pm to produce σ ij . Consequently, the whole computational expense on the SM side is represented by T mu + T ex + T H + T pm . Our approach considerably decreases the computing expense for the user. ( l represents multidimensional data.)

In data aggregation, upon receiving { C ij , σ ij ,TS,PI D ijl },i=1,2,,N from all the SMs, FN j performs n+1 bilinear pairwise operations and one ECC hash-to-dot for data verification. Subsequently, it consolidates the encrypted data and produces a signature, requiring 3n3 conventional multiplication operations to form the aggregated ciphertext, one ECC hash-to-point operation, and one ECC pointwise multiply operation to create a new signature on the aggregated the encrypted data. The overall computational expense of FN is expressed as ( n+1 ) T bp +( 3n3 ) T mu +2 T H + T pm . Within the Fog computing paradigm, Fog Nodes has greater processing capabilities than traditional nodes. Consequently, the aggregate procedure may be executed efficiently. Subsequently, we evaluate the computational expense of the current schemes with regard to each SM and AG individually. In the approach proposed by Zhang et al. [25], offer a data reporting method where in the computation is expressed as ω 1 m i1 + ω k+1 m i1 2 + ω 2 m i2 + ω k+2 m i2 2 ++ ω k m ik + ω 2k m ik 2 , each SM requires 3k multiplication operations, in addition to two exponentiation operations, one hash operation, and one multiplication to generate C T i . Generating the authenticator σ i requires two exponentiation operations, two hash operations, and one multiplication. Thus, the total computation costs on the SM side total 4 T ex +3 T h +( 3k+2 ) T mu . In Data Aggregation, after AG receives { C T i , σ i ,T },i=1,2,,n , from all SMs, it requires n1 multiplication operations to generate the aggregated ciphertext CT. It requires one exponentiation, one hash operation to compute τ n , n multiplication operations to generate ξ , and n exponentiation operations coupled with 2n1 multiplication operations to get σ . The total computational costs of AG are represented as ( n+1 ) T ex +( 4n2 ) T mu + T h . Using a comparable analytical method, Zuo et al.’s framework [23] indicates that the overall computational expense for each SM component is k T mu + T h +5 T ex , while the total computational expense for the AG component is ( n+1 ) T bp +( 4n4 ) T mu +( n+1 ) T h + T ex . Ultimately, the overall computational expense for each SM and aggregator in Boudia et al.’s framework [17] is 2 T ex +1 T H +1 T pm and ( n+1 ) T bp +( 3n3 ) T mu +2 T H +1 T pm , respectively.

A comparison of the computational overhead in terms of SMs is presented in Figure 3.

7.2. Communication Cost

Given that SMs are equipped with resource-constrained storage and computational devices, each SM encrypts l categories of data and transmits them to the respective aggregator. The communication cost can be categorized into two components: communication from the SM to the FN and communication from the FN to the CC. Given that SMs are equipped with resource-constrained storage and computational equipment, we concentrate on assessing the communication

Figure 3. Computational overhead on the SM side.

expenses from SMs to FN. In EAMA, the variables, { C ij ,TS, σ ij ,PI D ijl } are sent from the SM to FN j , where C ij n 2 and σ ij G 1 . To provide uniformity, we designate the size of the PI D ijl and timestamp as 160 bits and 64 bits, respectively, across all methods. Consequently, the transmission expense from SM to FN amounts to 2048+64+160+160=2432 bits. Subsequently, we examine the correspondence from FN to CC. In EAMA, the elements, { C j , TS, σ j ,I D j } are transmitted from FN j to CC, where C ij n 2 and σ ij G 1 . The communication cost from FN to CC is calculated as 2048+64+160+64=2336 bits. In the system proposed by Zhang et al. [25], each S M i transmits { C T i , σ i ,T } to AG. Consequently, the transmission expense from n SMs to AG amounts to ( 2048+512+64 )n=2624n bits. In the approach proposed by Zuo et al. [23], each S M i transmits { I D i , C i a , C i b , T i , σ i ,p k i } to AG. The transmission expense from n SMs to AG is ( 64+512+512+64+1024+512 )n=2688n bits. In the framework suggested by Boudia et al. [17], { C ij ,I D ij ,TS, σ ij } is transmitted from S M ij to the respective FN, where C ij represents the ciphertext of the Paillier encryption system and σ ij denotes a signature, resulting in a total communication cost of ( 2048+64+64+160 )n=2336n bits. Ultimately, the comparison of communication costs is illustrated in Figure 4.

To evaluate our enhanced scheme against current ones, we established n=400 as the real value and performed a comprehensive comparison of the communication costs from n SMs to the FN, as illustrated in Figure 4. The figure indicates that the communication cost of our design is inferior to that of other schemes, with the exception of the scheme proposed by Boudia et al. [17]. While the transmission cost from SM to FN in the strategy proposed by Boudia et al. is somewhat cheaper than ours, their scheme lacks pseudonyms, potentially compromising its security.

Figure 4. Communication overhead on the SM side.

8. Conclusion

In this paper, we propose an efficient and anonymized multidimensional data aggregation scheme for fog computing smart grids, named EAMA. By leveraging an improved Paillier encryption scheme, our approach enhances encryption performance. Furthermore, the scheme incorporates a pseudonym mechanism for SMs, enabling FNs to aggregate data based on pseudonyms before retrieving the final aggregated results. The security analysis concludes that the approach guarantees data privacy, confidentiality, integrity, and authentication. The performance analysis underscores the scalability benefits of EAMA, the efficacy of its fault-tolerance mechanism, and its cost-efficiency in both computing and communication. Furthermore, EAMA supports inquiries beyond mere summing, rendering it appropriate for the application needs of smart city smart grids.

Funding

The work was supported by the National Natural Science Foundation of China under Grant 12061027, by the Natural Science Foundation of Guangxi of China under Grant 2018GXNSFBA281019, by the Doctoral Research Foundation of Guilin University of Technology under Grant GUTQDJJ2018033, and by the Opening Fund of Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education under Grant CRKL210206.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Zhang, Z., Deng, R., Yau, D.K.Y., Cheng, P. and Chen, J. (2020) Analysis of Moving Target Defense against False Data Injection Attacks on Power Grid. IEEE Transactions on Information Forensics and Security, 15, 2320-2335.
https://doi.org/10.1109/tifs.2019.2928624
[2] Chouikhi, S., Esseghir, M. and Merghem-Boulahia, L. (2023) Energy Consumption Scheduling as a Fog Computing Service in Smart Grid. IEEE Transactions on Services Computing, 16, 1144-1157.
https://doi.org/10.1109/tsc.2022.3174698
[3] Yigit, M., Gungor, V.C. and Baktir, S. (2014) Cloud Computing for Smart Grid Applications. Computer Networks, 70, 312-329.
https://doi.org/10.1016/j.comnet.2014.06.007
[4] Liu, J., Weng, J., Yang, A., Chen, Y. and Lin, X. (2020) Enabling Efficient and Privacy-Preserving Aggregation Communication and Function Query for Fog Computing-Based Smart Grid. IEEE Transactions on Smart Grid, 11, 247-257.
https://doi.org/10.1109/tsg.2019.2920836
[5] Ni, J., Zhang, K. and Vasilakos, A.V. (2021) Security and Privacy for Mobile Edge Caching: Challenges and Solutions. IEEE Wireless Communications, 28, 77-83.
https://doi.org/10.1109/mwc.001.2000329
[6] Kabalci, Y. (2016) A Survey on Smart Metering and Smart Grid Communication. Renewable and Sustainable Energy Reviews, 57, 302-318.
https://doi.org/10.1016/j.rser.2015.12.114
[7] Rossi, B. and Chren, S. (2020) Smart Grids Data Analysis: A Systematic Mapping Study. IEEE Transactions on Industrial Informatics, 16, 3619-3639.
https://doi.org/10.1109/tii.2019.2954098
[8] Gong, Y., Cai, Y., Guo, Y. and Fang, Y. (2016) A Privacy-Preserving Scheme for Incentive-Based Demand Response in the Smart Grid. IEEE Transactions on Smart Grid, 7, 1304-1313.
https://doi.org/10.1109/tsg.2015.2412091
[9] Paillier, P. (1999) Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In: Stern, J., Ed., Advances in CryptologyEUROCRYPT’99. Lecture Notes in Computer Science, Springer, 223-238.
https://doi.org/10.1007/3-540-48910-x_16
[10] Kataray, T., Nitesh, B., Yarram, B., Sinha, S., Cuce, E., Shaik, S., et al. (2023) Integration of Smart Grid with Renewable Energy Sources: Opportunities and Challenges—A Comprehensive Review. Sustainable Energy Technologies and Assessments, 58, Article 103363.
https://doi.org/10.1016/j.seta.2023.103363
[11] Zhang, K., Lu, L., Zhao, J., Wei, L. and Ning, J. (2024) Secure Multi-Asks/bids with Verifiable Equality Retrieval for Double Auction in Smart Grid. Peer-to-Peer Networking and Applications, 17, 3255-3268.
https://doi.org/10.1007/s12083-024-01744-5
[12] Ayub Khan, A., Ali Laghari, A., Rashid, M., Li, H., Rehman Javed, A. and Reddy Gadekallu, T. (2023) Artificial Intelligence and Blockchain Technology for Secure Smart Grid and Power Distribution Automation: A State-of-the-Art Review. Sustainable Energy Technologies and Assessments, 57, Article 103282.
https://doi.org/10.1016/j.seta.2023.103282
[13] Lyu, L., Nandakumar, K., Rubinstein, B., Jin, J., Bedo, J. and Palaniswami, M. (2018) PPFA: Privacy Preserving Fog-Enabled Aggregation in Smart Grid. IEEE Transactions on Industrial Informatics, 14, 3733-3744.
https://doi.org/10.1109/tii.2018.2803782
[14] Gope, P. and Sikdar, B. (2020) An Efficient Privacy-Friendly Hop-by-Hop Data Aggregation Scheme for Smart Grids. IEEE Systems Journal, 14, 343-352.
https://doi.org/10.1109/jsyst.2019.2899986
[15] Gope, P. and Sikdar, B. (2019) Lightweight and Privacy-Friendly Spatial Data Aggregation for Secure Power Supply and Demand Management in Smart Grids. IEEE Transactions on Information Forensics and Security, 14, 1554-1566.
https://doi.org/10.1109/tifs.2018.2881730
[16] Saleem, A., Khan, A., Malik, S.U.R., Pervaiz, H., Malik, H., Alam, M., et al. (2020) FESDA: Fog-Enabled Secure Data Aggregation in Smart Grid IoT Network. IEEE Internet of Things Journal, 7, 6132-6142.
https://doi.org/10.1109/jiot.2019.2957314
[17] Merad-Boudia, O.R. and Senouci, S.M. (2021) An Efficient and Secure Multidimensional Data Aggregation for Fog-Computing-Based Smart Grid. IEEE Internet of Things Journal, 8, 6143-6153.
https://doi.org/10.1109/jiot.2020.3040982
[18] Li, F., Luo, B. and Liu, P. (2010). Secure Information Aggregation for Smart Grids Using Homomorphic Encryption. 2010 1st IEEE International Conference on Smart Grid Communications, Gaithersburg, 4-6 October 2010, 327-332.
https://doi.org/10.1109/smartgrid.2010.5622064
[19] Lu, R.X., Liang, X.H., Li, X., Lin, X.D. and Shen, X.M. (2012) EPPA: An Efficient and Privacy-Preserving Aggregation Scheme for Secure Smart Grid Communications. IEEE Transactions on Parallel and Distributed Systems, 23, 1621-1631.
https://doi.org/10.1109/tpds.2012.86
[20] Li, S., Xue, K., Yang, Q. and Hong, P. (2018) PPMA: Privacy-Preserving Multisubset Data Aggregation in Smart Grid. IEEE Transactions on Industrial Informatics, 14, 462-471.
https://doi.org/10.1109/tii.2017.2721542
[21] Mustafa, M.A., Cleemput, S., Aly, A. and Abidin, A. (2019) A Secure and Privacy-Preserving Protocol for Smart Metering Operational Data Collection. IEEE Transactions on Smart Grid, 10, 6481-6490.
https://doi.org/10.1109/tsg.2019.2906016
[22] Merad Boudia, O.R., Senouci, S.M. and Feham, M. (2017) Elliptic Curve-Based Secure Multidimensional Aggregation for Smart Grid Communications. IEEE Sensors Journal, 17, 7750-7757.
https://doi.org/10.1109/jsen.2017.2720458
[23] Zuo, X., Li, L., Peng, H., Luo, S. and Yang, Y. (2021) Privacy-Preserving Multidimensional Data Aggregation Scheme without Trusted Authority in Smart Grid. IEEE Systems Journal, 15, 395-406.
https://doi.org/10.1109/jsyst.2020.2994363
[24] Alsharif, A., Nabil, M., Sherif, A., Mahmoud, M. and Song, M. (2019) MDMS: Efficient and Privacy-Preserving Multidimension and Multisubset Data Collection for AMI Networks. IEEE Internet of Things Journal, 6, 10363-10374.
https://doi.org/10.1109/jiot.2019.2938776
[25] Zhang, S., Chang, J. and Wang, B. (2023) A Multidimensional Data Aggregation Scheme of Smart Home in Microgrid with Fault Tolerance and Billing for Demand Response. IEEE Systems Journal, 17, 4639-4649.
https://doi.org/10.1109/jsyst.2023.3286468
[26] Damgård, I., Jurik, M. and Nielsen, J.B. (2010) A Generalization of Paillier’s Public-Key System with Applications to Electronic Voting. International Journal of Information Security, 9, 371-385.
https://doi.org/10.1007/s10207-010-0119-9
[27] Shamus (2014) Multiprecision Integer and Rational Arithmetic C/C++ Library (Miracl).
[28] Lynn PBC Library.
https://crypto.stanford.edu/pbc/

Copyright © 2025 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.